Copyright (c) 1993-2008, Cognitive Technologies

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the Cognitive Technologies nor the names of its
contributors may be used to endorse or promote products derived from this
software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

MMX_FUNC(void) MMX_addshab_cykl(int * src, int cg, signed char * dst, int num)
paddd mm7, mm6 ;; mm7=((Int32)cg,(Int32)cg)
label0:; // process 8 elements per iteration
pcmpgtb mm6, mm0 ;; sign masks of the bytes in mm0
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
pxor mm6, mm6 ;; mm6 = 0
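/*
 * A scalar sketch of what the register comments above appear to describe:
 * each 32-bit accumulator g[i] grows by a[i]*cg, where a[i] is a byte that
 * has been sign-extended. This is inferred from the comments only. The name
 * addshab_scalar, its parameter names, and the mapping of the original
 * pointers onto the accumulator and byte roles are assumptions, not the
 * original implementation.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference: g[i] += a[i] * cg for the first n elements.
 * g  - 32-bit accumulators (the g0..g7 of the MMX comments)
 * a  - signed bytes        (the a0..a7 of the MMX comments)
 * cg - common integer multiplier
 */
void addshab_scalar(int *g, int cg, const signed char *a, int n)
{
    assert(g != NULL && a != NULL && n >= 0);
    for (int i = 0; i < n; ++i) {
        /* The cast makes the sign extension of the byte explicit. */
        g[i] += (int)a[i] * cg;
    }
}
```

/*
 * For example, with g = {0, 1, 2, 3}, a = {1, -1, 127, -128} and cg = 10,
 * the call addshab_scalar(g, 10, a, 4) leaves g = {10, -9, 1272, -1277}.
 * A reference like this can be useful for checking a SIMD version against.
 */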

MMX_FUNC(void) MMX_addshab(int * src, int cg, signed char * dst)
paddd mm7, mm6 // mm7=((Int32)cg,(Int32)cg)
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
punpckHbw mm1, mm6 ;; a7,a6,a5,a4
punpckLbw mm0, mm6 ;; a3,a2,a1,a0
punpckHwd mm0, mm6 ;; a3,a2
punpckLwd mm2, mm6 ;; a1,a0
pmaddwd mm2, mm7 ;; (a1,a0)*cg
pmaddwd mm0, mm7 ;; (a3,a2)*cg
punpckHwd mm1, mm6 ;; a7,a6
punpckLwd mm3, mm6 ;; a5,a4
paddd mm4, mm2 ;; (g1,g0)+=(a1,a0)*cg
paddd mm5, mm0 ;; (g3,g2)+=(a3,a2)*cg
pmaddwd mm3, mm7 ;; (a5,a4)*cg
pmaddwd mm1, mm7 ;; (a7,a6)*cg
paddd mm4, mm3 ;; (g5,g4)+=(a5,a4)*cg
paddd mm5, mm1 ;; (g7,g6)+=(a7,a6)*cg
// last portion 48 = 4*12