diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index c3014df..299615d 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -262,8 +262,6 @@ mtrr.txt
- how to use PPro Memory Type Range Registers to increase performance.
- info on the generic mutex subsystem.
- - directory with various information about namespaces
- info on a TCP implementation of a network block device.
diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index 4953bc2..054a7ec 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -11,7 +11,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
procfs-guide.xml writing_usb_driver.xml \
kernel-api.xml filesystems.xml lsm.xml usb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
- genericirq.xml s390-drivers.xml uio-howto.xml
+ genericirq.xml s390-drivers.xml
# The build process is as follows (targets):
diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl
index fdd7f4f..c119484 100644
--- a/Documentation/DocBook/uio-howto.tmpl
+++ b/Documentation/DocBook/uio-howto.tmpl
- <revnumber>0.4</revnumber>
- <date>2007-11-26</date>
- <authorinitials>hjk</authorinitials>
- <revremark>Removed section about uio_dummy.</revremark>
<revnumber>0.3</revnumber>
<date>2007-04-29</date>
<authorinitials>hjk</authorinitials>
@@ -100,26 +94,6 @@ interested in translating it, please email me
user space. This simplifies development and reduces the risk of
serious bugs within a kernel module.
- Please note that UIO is not an universal driver interface. Devices
- that are already handled well by other kernel subsystems (like
- networking or serial or USB) are no candidates for an UIO driver.
- Hardware that is ideally suited for an UIO driver fulfills all of
- <para>The device has memory that can be mapped. The device can be
- controlled completely by writing to this memory.</para>
- <para>The device usually generates interrupts.</para>
- <para>The device does not fit into one of the standard kernel
@@ -200,9 +174,8 @@ interested in translating it, please email me
For cards that don't generate interrupts but need to be
polled, there is the possibility to set up a timer that
triggers the interrupt handler at configurable time intervals.
- This interrupt simulation is done by calling
- <function>uio_event_notify()</function>
- from the timer's event handler.
+ See <filename>drivers/uio/uio_dummy.c</filename> for an
+ example of this technique.
@@ -290,11 +263,63 @@ offset = N * getpagesize();
+<chapter id="using-uio_dummy" xreflabel="Using uio_dummy">
+<?dbhtml filename="using-uio_dummy.html"?>
+<title>Using uio_dummy</title>
+ Well, there is no real use for uio_dummy. Its only purpose is
+ to test most parts of the UIO system (everything except
+ hardware interrupts), and to serve as an example for the
+ kernel module that you will have to write yourself.
+<sect1 id="what_uio_dummy_does">
+<title>What uio_dummy does</title>
+ The kernel module <filename>uio_dummy.ko</filename> creates a
+ device that uses a timer to generate periodic interrupts. The
+ interrupt handler does nothing but increment a counter. The
+ driver adds two custom attributes, <varname>count</varname>
+ and <varname>freq</varname>, that appear under
+ <filename>/sys/devices/platform/uio_dummy/</filename>.
+ The attribute <varname>count</varname> can be read and
+ written. The associated file
+ <filename>/sys/devices/platform/uio_dummy/count</filename>
+ appears as a normal text file and contains the total number of
+ timer interrupts. If you look at it (e.g. using
+ <function>cat</function>), you'll notice it is slowly counting
+ The attribute <varname>freq</varname> can be read and written.
+ <filename>/sys/devices/platform/uio_dummy/freq</filename>
+ represents the number of system timer ticks between two timer
+ interrupts. The default value of <varname>freq</varname> is
+ the value of the kernel variable <varname>HZ</varname>, which
+ gives you an interval of one second. Lower values will
+ increase the frequency. Try the following:
+<programlisting format="linespecific">
+cd /sys/devices/platform/uio_dummy/
+ Use <function>cat count</function> to see how the interrupt
<chapter id="custom_kernel_module" xreflabel="Writing your own kernel module">
<?dbhtml filename="custom_kernel_module.html"?>
<title>Writing your own kernel module</title>
- Please have a look at <filename>uio_cif.c</filename> as an
+ Please have a look at <filename>uio_dummy.c</filename> as an
example. The following paragraphs explain the different
sections of this file.
@@ -329,8 +354,9 @@ See the description below for details.
interrupt, it's your modules task to determine the irq number during
initialization. If you don't have a hardware generated interrupt but
want to trigger the interrupt handler in some other way, set
-<varname>irq</varname> to <varname>UIO_IRQ_CUSTOM</varname>.
-If you had no interrupt at all, you could set
+<varname>irq</varname> to <varname>UIO_IRQ_CUSTOM</varname>. The
+uio_dummy module does this as it triggers the event mechanism in a timer
+routine. If you had no interrupt at all, you could set
<varname>irq</varname> to <varname>UIO_IRQ_NONE</varname>, though this
diff --git a/Documentation/namespaces/compatibility-list.txt b/Documentation/namespaces/compatibility-list.txt
deleted file mode 100644
index defc558..0000000
--- a/Documentation/namespaces/compatibility-list.txt
- Namespaces compatibility list
-This document contains the information about the problems user
-may have when creating tasks living in different namespaces.
-Here's the summary. This matrix shows the known problems, that
-occur when tasks share some namespace (the columns) while living
-in different other namespaces (the rows):
- UTS IPC VFS PID User Net
-1. Both the IPC and the PID namespaces provide IDs to address
- object inside the kernel. E.g. semaphore with IPCID or
- process group with pid.
- In both cases, tasks shouldn't try exposing this ID to some
- other task living in a different namespace via a shared filesystem
- or IPC shmem/message. The fact is that this ID is only valid
- within the namespace it was obtained in and may refer to some
- other object in another namespace.
-2. Intentionally, two equal user IDs in different user namespaces
- should not be equal from the VFS point of view. In other
- words, user 10 in one user namespace shouldn't have the same
- access permissions to files, belonging to user 10 in another
- The same is true for the IPC namespaces being shared - two users
- from different user namespaces should not access the same IPC objects
- even having equal UIDs.
- But currently this is not so.
diff --git a/Documentation/tty.txt b/Documentation/tty.txt
index 8e65c44..048a876 100644
--- a/Documentation/tty.txt
+++ b/Documentation/tty.txt
@@ -132,14 +132,6 @@ set_termios() Notify the tty driver that the device's termios
tty->termios. Previous settings should be passed in
- The API is defined such that the driver should return
- the actual modes selected. This means that the
- driver function is responsible for modifying any
- bits in the request it cannot fulfill to indicate
- the actual modes being used. A device with no
- hardware capability for change (eg a USB dongle or
- virtual port) can provide NULL for this method.
throttle() Notify the tty driver that input buffers for the
line discipline are close to full, and it should
somehow signal that no more characters should be
diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt
index b2fc4d4..97842de 100644
--- a/Documentation/usb/power-management.txt
+++ b/Documentation/usb/power-management.txt
@@ -278,14 +278,6 @@ optional. The methods' jobs are quite simple:
(although the interfaces will be in the same altsettings as
-If the device is disconnected or powered down while it is suspended,
-the disconnect method will be called instead of the resume or
-reset_resume method. This is also quite likely to happen when
-waking up from hibernation, as many systems do not maintain suspend
-current to the USB host controllers during hibernation. (It's
-possible to work around the hibernation-forces-disconnect problem by
-using the USB Persist facility.)
The reset_resume method is used by the USB Persist facility (see
Documentation/usb/persist.txt) and it can also be used under certain
circumstances when CONFIG_USB_PERSIST is not enabled. Currently, if a
diff --git a/Documentation/x86_64/uefi.txt b/Documentation/x86_64/uefi.txt
deleted file mode 100644
index 91a98ed..0000000
--- a/Documentation/x86_64/uefi.txt
-General note on [U]EFI x86_64 support
--------------------------------------
-The nomenclature EFI and UEFI are used interchangeably in this document.
-Although the tools below are _not_ needed for building the kernel,
-the needed bootloader support and associated tools for x86_64 platforms
-with EFI firmware and specifications are listed below.
-1. UEFI specification: http://www.uefi.org
-2. Booting Linux kernel on UEFI x86_64 platform requires bootloader
- support. Elilo with x86_64 support can be used.
-3. x86_64 platform with EFI/UEFI firmware.
-- Build the kernel with the following configuration.
- CONFIG_FRAMEBUFFER_CONSOLE=y
-- Create a VFAT partition on the disk
-- Copy the following to the VFAT partition:
- elilo bootloader with x86_64 support, elilo configuration file,
- kernel image built in first step and corresponding
- initrd. Instructions on building elilo and its dependencies
- can be found in the elilo sourceforge project.
-- Boot to EFI shell and invoke elilo choosing the kernel image built
diff --git a/MAINTAINERS b/MAINTAINERS
index 7c8392e..f5bd9ba 100644
@@ -323,7 +323,8 @@ S: Maintained
ALCATEL SPEEDTOUCH USB DRIVER
M: duncan.sands@free.fr
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.linux-usb.org/SpeedTouch/
@@ -439,7 +440,7 @@ S: Maintained
ARM/ATMEL AT91RM9200 ARM ARCHITECTURE
-M: linux@maxim.org.za
+M: andrew@sanpeople.com
L: linux-arm-kernel@lists.arm.linux.org.uk (subscribers-only)
W: http://maxim.org.za/at91_26.html
@@ -1042,7 +1043,7 @@ S: Maintained
CIRRUS LOGIC EP93XX OHCI USB HOST DRIVER
M: kernel@wantstofly.org
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
CIRRUS LOGIC CS4280/CS461x SOUNDDRIVER
@@ -1551,7 +1552,7 @@ S: Maintained
FREESCALE HIGHSPEED USB DEVICE DRIVER
M: leoli@freescale.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: linuxppc-dev@ozlabs.org
@@ -2110,14 +2111,6 @@ L: irda-users@lists.sourceforge.net (subscribers-only)
W: http://irda.sourceforge.net/
-M: michaelc@cs.wisc.edu
-L: open-iscsi@googlegroups.com
-W: www.open-iscsi.org
-T: git kernel.org:/pub/scm/linux/kernel/mnc/linux-2.6-iscsi.git
@@ -3817,20 +3810,22 @@ S: Maintained
M: oliver@neukum.name
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB BLOCK DRIVER (UB ub)
M: zaitcev@redhat.com
L: linux-kernel@vger.kernel.org
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
USB CDC ETHERNET DRIVER
P: Greg Kroah-Hartman
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.kroah.com/linux-usb/
@@ -3844,13 +3839,13 @@ S: Maintained
M: dbrownell@users.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
USB ET61X[12]51 DRIVER
M: luca.risolia@studio.unibo.it
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: video4linux-list@redhat.com
W: http://www.linux-projects.org
@@ -3858,33 +3853,41 @@ S: Maintained
USB GADGET/PERIPHERAL SUBSYSTEM
M: dbrownell@users.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.linux-usb.org/gadget
USB HID/HIDBP DRIVERS (USB KEYBOARDS, MICE, REMOTE CONTROLS, ...)
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
T: git kernel.org:/pub/scm/linux/kernel/git/jikos/hid.git
+M: johannes@erdfelt.com
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
USB KAWASAKI LSI DRIVER
M: oliver@neukum.name
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB MASS STORAGE DRIVER
M: mdharm-usb@one-eyed-alien.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
L: usb-storage@lists.one-eyed-alien.net
W: http://www.one-eyed-alien.net/~mdharm/linux-usb/
@@ -3892,26 +3895,28 @@ W: http://www.one-eyed-alien.net/~mdharm/linux-usb/
M: dbrownell@users.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB OPTION-CARD DRIVER
M: smurf@smurf.noris.de
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
M: mmcclell@bigfoot.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://alpha.dyndns.org/ov511/
M: petkan@users.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: netdev@vger.kernel.org
W: http://pegasus2.sourceforge.net/
@@ -3919,13 +3924,14 @@ S: Maintained
USB PRINTER DRIVER (usblp)
M: zaitcev@redhat.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
M: petkan@users.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: netdev@vger.kernel.org
W: http://pegasus2.sourceforge.net/
@@ -3933,7 +3939,8 @@ S: Maintained
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.chello.nl/~j.vreeken/se401/
@@ -3947,59 +3954,72 @@ USB SERIAL DIGI ACCELEPORT DRIVER
P: Peter Berger and Al Borchers
M: pberger@brimson.com
M: alborchers@steinerpoint.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
P: Greg Kroah-Hartman
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB SERIAL BELKIN F5U103 DRIVER
P: William Greathouse
M: wgreathouse@smva.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB SERIAL CYPRESS M8 DRIVER
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://geocities.com/i0xox0i
W: http://firstlight.net/cvs
+USB SERIAL CYBERJACK PINPAD/E-COM DRIVER
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
M: wolfgang@iksw-muees.de
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB SERIAL EMPEG EMPEG-CAR MARK I/II DRIVER
M: xavyer@ix.netcom.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB SERIAL KEYSPAN DRIVER
P: Greg Kroah-Hartman
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.kroah.com/linux/
USB SERIAL WHITEHEAT DRIVER
P: Support Department
M: support@connecttech.com
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.connecttech.com
M: luca.risolia@studio.unibo.it
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: video4linux-list@redhat.com
W: http://www.linux-projects.org
@@ -4007,7 +4027,8 @@ S: Maintained
P: Greg Kroah-Hartman
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://www.linux-usb.org
T: quilt kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/
@@ -4015,7 +4036,8 @@ S: Supported
M: stern@rowland.harvard.edu
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
USB "USBNET" DRIVER FRAMEWORK
@@ -4028,7 +4050,7 @@ S: Maintained
USB W996[87]CF DRIVER
M: luca.risolia@studio.unibo.it
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: video4linux-list@redhat.com
W: http://www.linux-projects.org
@@ -4036,7 +4058,7 @@ S: Maintained
M: luca.risolia@studio.unibo.it
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: video4linux-list@redhat.com
W: http://www.linux-projects.org
@@ -4044,14 +4066,15 @@ S: Maintained
-L: linux-usb@vger.kernel.org
+L: linux-usb-users@lists.sourceforge.net
+L: linux-usb-devel@lists.sourceforge.net
W: http://linux-lc100020.sourceforge.net
-L: linux-usb@vger.kernel.org
+L: linux-usb-devel@lists.sourceforge.net
L: video4linux-list@redhat.com
W: http://royale.zerezo.com/zr364xx/
diff --git a/arch/alpha/kernel/pci-noop.c b/arch/alpha/kernel/pci-noop.c
index 468b76c..174b729 100644
--- a/arch/alpha/kernel/pci-noop.c
+++ b/arch/alpha/kernel/pci-noop.c
#include <linux/errno.h>
#include <linux/sched.h>
#include <linux/dma-mapping.h>
-#include <linux/scatterlist.h>
@@ -173,19 +172,18 @@ dma_alloc_coherent(struct device *dev, size_t size,
EXPORT_SYMBOL(dma_alloc_coherent);
-dma_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
enum dma_data_direction direction)
- struct scatterlist *sg;
- for_each_sg(sgl, sg, nents, i) {
+ for (i = 0; i < nents; i++ ) {
- BUG_ON(!sg_page(sg));
- sg_dma_address(sg) = (dma_addr_t)virt_to_bus(va);
- sg_dma_len(sg) = sg->length;
+ BUG_ON(!sg[i].page);
+ va = page_address(sg[i].page) + sg[i].offset;
+ sg_dma_address(sg + i) = (dma_addr_t)virt_to_bus(va);
+ sg_dma_len(sg + i) = sg[i].length;
diff --git a/arch/arm/common/uengine.c b/arch/arm/common/uengine.c
index 117cab3..95c8508 100644
--- a/arch/arm/common/uengine.c
+++ b/arch/arm/common/uengine.c
@@ -374,8 +374,8 @@ static int set_initial_registers(int uengine, struct ixp2000_uengine_code *c)
- gpr_a = kzalloc(128 * sizeof(u32), GFP_KERNEL);
- gpr_b = kzalloc(128 * sizeof(u32), GFP_KERNEL);
+ gpr_a = kmalloc(128 * sizeof(u32), GFP_KERNEL);
+ gpr_b = kmalloc(128 * sizeof(u32), GFP_KERNEL);
ucode = kmalloc(513 * 5, GFP_KERNEL);
if (gpr_a == NULL || gpr_b == NULL || ucode == NULL) {
@@ -388,6 +388,8 @@ static int set_initial_registers(int uengine, struct ixp2000_uengine_code *c)
if (c->uengine_parameters & IXP2000_UENGINE_4_CONTEXTS)
+ memset(gpr_a, 0, sizeof(gpr_a));
+ memset(gpr_b, 0, sizeof(gpr_b));
for (i = 0; i < 256; i++) {
struct ixp2000_reg_value *r = c->initial_reg_values + i;
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 29dec08..d645897 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -339,6 +339,16 @@ __pabt_svc:
str r1, [sp] @ save the "real" r0 copied
@ from the exception stack
+#if __LINUX_ARM_ARCH__ < 6 && !defined(CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG)
+#warning "NPTL on non MMU needs fixing"
+ @ make sure our user space atomic helper is aborted
+ bichs r3, r3, #PSR_Z_BIT
@ We are now ready to fill in the remaining blanks on the stack:
@@ -362,25 +372,9 @@ __pabt_svc:
- .macro kuser_cmpxchg_check
-#if __LINUX_ARM_ARCH__ < 6 && !defined(CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG)
-#warning "NPTL on non MMU needs fixing"
- @ Make sure our user space atomic helper is restarted
- @ if it was interrupted in a critical region. Here we
- @ perform a quick test inline since it should be false
- @ 99.9999% of the time. The rest is done out of line.
- blhs kuser_cmpxchg_fixup
- kuser_cmpxchg_check
@ Call the processor-specific abort handler:
@@ -410,7 +404,6 @@ __dabt_usr:
- kuser_cmpxchg_check
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_off
@@ -453,9 +446,9 @@ __und_usr:
adr r9, ret_from_exception
adr lr, __und_usr_unknown
@ fallthrough to call_fpe
@@ -676,7 +669,7 @@ __kuser_helper_start:
+ * the Z flag might be lost
* Definition and user space usage example:
@@ -737,6 +730,9 @@ __kuser_memory_barrier: @ 0xffff0fa0
* - This routine already includes memory barriers as needed.
+ * - A failure might be transient, i.e. it is possible, although unlikely,
+ * that "failure" be returned even if *ptr == oldval.
* For example, a user space atomic_add implementation could look like this:
* #define atomic_add(ptr, val) \
@@ -773,62 +769,46 @@ __kuser_cmpxchg: @ 0xffff0fc0
#elif __LINUX_ARM_ARCH__ < 6
- * The only thing that can break atomicity in this cmpxchg
- * implementation is either an IRQ or a data abort exception
- * causing another process/thread to be scheduled in the middle
- * of the critical sequence. To prevent this, code is added to
- * the IRQ and data abort exception handlers to set the pc back
- * to the beginning of the critical section if it is found to be
- * within that critical section (see kuser_cmpxchg_fixup).
+ * Theory of operation:
+ * We set the Z flag before loading oldval. If ever an exception
+ * occurs we can not be sure the loaded value will still be the same
+ * when the exception returns, therefore the user exception handler
+ * will clear the Z flag whenever the interrupted user code was
+ * actually from the kernel address space (see the usr_entry macro).
+ * The post-increment on the str is used to prevent a race with an
+ * exception happening just after the str instruction which would
+ * clear the Z flag although the exchange was done.
-1: ldr r3, [r2] @ load current val
- subs r3, r3, r0 @ compare with oldval
-2: streq r1, [r2] @ store newval if eq
- rsbs r0, r3, #0 @ set return val and C flag
-kuser_cmpxchg_fixup:
- @ Called from kuser_cmpxchg_check macro.
- @ r2 = address of interrupted insn (must be preserved).
- @ sp = saved regs. r7 and r8 are clobbered.
- @ 1b = first critical insn, 2b = last critical insn.
- @ If r2 >= 1b and r2 <= 2b then saved pc_usr is set to 1b.
- mov r7, #0xffff0fff
- sub r7, r7, #(0xffff0fff - (0xffff0fc0 + (1b - __kuser_cmpxchg)))
- rsbcss r8, r8, #(2b - 1b)
- strcs r7, [sp, #S_PC]
+ teq ip, ip @ set Z flag
+ ldr ip, [r2] @ load current val
+ add r3, r2, #1 @ prepare store ptr
+ teqeq ip, r0 @ compare with oldval if still allowed
+ streq r1, [r3, #-1]! @ store newval if still allowed
+ subs r0, r2, r3 @ if r2 == r3 the str occured
#warning "NPTL on non MMU needs fixing"
mcr p15, 0, r0, c7, c10, 5 @ dmb
- /* beware -- each __kuser slot must be 8 instructions max */
- b __kuser_memory_barrier
+ mcr p15, 0, r0, c7, c10, 5 @ dmb
@@ -849,7 +829,7 @@ kuser_cmpxchg_fixup:
+ * the Z flag might be lost
* Definition and user space usage example:
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index c34db4e..4764bd9 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -327,7 +327,7 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
if ((instr & hook->instr_mask) == hook->instr_val &&
(regs->ARM_cpsr & hook->cpsr_mask) == hook->cpsr_val) {
if (hook->fn(regs, instr) == 0) {
- spin_unlock_irqrestore(&undef_lock, flags);
+ spin_unlock_irq(&undef_lock);
@@ -509,7 +509,7 @@ asmlinkage int arm_syscall(int no, struct pt_regs *regs)
* existence. Don't ever use this from user code.
extern void do_DataAbort(unsigned long addr, unsigned int fsr,
struct pt_regs *regs);
@@ -545,6 +545,7 @@ asmlinkage int arm_syscall(int no, struct pt_regs *regs)
up_read(&mm->mmap_sem);
/* simulate a write access fault */
do_DataAbort(addr, 15 + (1 << 11), regs);
diff --git a/arch/arm/mach-at91/at91rm9200_devices.c b/arch/arm/mach-at91/at91rm9200_devices.c
index 9296833..0417c16 100644
--- a/arch/arm/mach-at91/at91rm9200_devices.c
+++ b/arch/arm/mach-at91/at91rm9200_devices.c
#include <asm/mach/map.h>
#include <linux/platform_device.h>
-#include <linux/i2c-gpio.h>
#include <asm/arch/board.h>
#include <asm/arch/gpio.h>
@@ -436,40 +435,7 @@ void __init at91_add_device_nand(struct at91_nand_data *data) {}
* -------------------------------------------------------------------- */
- * Prefer the GPIO code since the TWI controller isn't robust
- * (gets overruns and underruns under load) and can only issue
- * repeated STARTs in one scenario (the driver doesn't yet handle them).
-#if defined(CONFIG_I2C_GPIO) || defined(CONFIG_I2C_GPIO_MODULE)
-static struct i2c_gpio_platform_data pdata = {
- .sda_pin = AT91_PIN_PA25,
- .sda_is_open_drain = 1,
- .scl_pin = AT91_PIN_PA26,
- .scl_is_open_drain = 1,
- .udelay = 2, /* ~100 kHz */
-static struct platform_device at91rm9200_twi_device = {
- .name = "i2c-gpio",
- .dev.platform_data = &pdata,
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
- at91_set_GPIO_periph(AT91_PIN_PA25, 1); /* TWD (SDA) */
- at91_set_multi_drive(AT91_PIN_PA25, 1);
- at91_set_GPIO_periph(AT91_PIN_PA26, 1); /* TWCK (SCL) */
- at91_set_multi_drive(AT91_PIN_PA26, 1);
- i2c_register_board_info(0, devices, nr_devices);
- platform_device_register(&at91rm9200_twi_device);
-#elif defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
+#if defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
static struct resource twi_resources[] = {
@@ -491,7 +457,7 @@ static struct platform_device at91rm9200_twi_device = {
.num_resources = ARRAY_SIZE(twi_resources),
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
+void __init at91_add_device_i2c(void)
/* pins used for TWI interface */
at91_set_A_periph(AT91_PIN_PA25, 0); /* TWD */
@@ -500,11 +466,10 @@ void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
at91_set_A_periph(AT91_PIN_PA26, 0); /* TWCK */
at91_set_multi_drive(AT91_PIN_PA26, 1);
- i2c_register_board_info(0, devices, nr_devices);
platform_device_register(&at91rm9200_twi_device);
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices) {}
+void __init at91_add_device_i2c(void) {}
diff --git a/arch/arm/mach-at91/at91sam9260_devices.c b/arch/arm/mach-at91/at91sam9260_devices.c
index 3091bf4..ffd3154 100644
--- a/arch/arm/mach-at91/at91sam9260_devices.c
+++ b/arch/arm/mach-at91/at91sam9260_devices.c
#include <asm/mach/map.h>
#include <linux/platform_device.h>
-#include <linux/i2c-gpio.h>
#include <asm/arch/board.h>
#include <asm/arch/gpio.h>
@@ -353,41 +352,7 @@ void __init at91_add_device_nand(struct at91_nand_data *data) {}
* -------------------------------------------------------------------- */
- * Prefer the GPIO code since the TWI controller isn't robust
- * (gets overruns and underruns under load) and can only issue
- * repeated STARTs in one scenario (the driver doesn't yet handle them).
-#if defined(CONFIG_I2C_GPIO) || defined(CONFIG_I2C_GPIO_MODULE)
-static struct i2c_gpio_platform_data pdata = {
- .sda_pin = AT91_PIN_PA23,
- .sda_is_open_drain = 1,
- .scl_pin = AT91_PIN_PA24,
- .scl_is_open_drain = 1,
- .udelay = 2, /* ~100 kHz */
-static struct platform_device at91sam9260_twi_device = {
- .name = "i2c-gpio",
- .dev.platform_data = &pdata,
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
- at91_set_GPIO_periph(AT91_PIN_PA23, 1); /* TWD (SDA) */
- at91_set_multi_drive(AT91_PIN_PA23, 1);
- at91_set_GPIO_periph(AT91_PIN_PA24, 1); /* TWCK (SCL) */
- at91_set_multi_drive(AT91_PIN_PA24, 1);
- i2c_register_board_info(0, devices, nr_devices);
- platform_device_register(&at91sam9260_twi_device);
-#elif defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
+#if defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
static struct resource twi_resources[] = {
@@ -409,7 +374,7 @@ static struct platform_device at91sam9260_twi_device = {
.num_resources = ARRAY_SIZE(twi_resources),
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
+void __init at91_add_device_i2c(void)
/* pins used for TWI interface */
at91_set_A_periph(AT91_PIN_PA23, 0); /* TWD */
@@ -418,11 +383,10 @@ void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
at91_set_A_periph(AT91_PIN_PA24, 0); /* TWCK */
at91_set_multi_drive(AT91_PIN_PA24, 1);
- i2c_register_board_info(0, devices, nr_devices);
platform_device_register(&at91sam9260_twi_device);
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices) {}
+void __init at91_add_device_i2c(void) {}
diff --git a/arch/arm/mach-at91/at91sam9261_devices.c b/arch/arm/mach-at91/at91sam9261_devices.c
index 64979a9..3576595 100644
--- a/arch/arm/mach-at91/at91sam9261_devices.c
+++ b/arch/arm/mach-at91/at91sam9261_devices.c
#include <asm/mach/map.h>
#include <linux/platform_device.h>
-#include <linux/i2c-gpio.h>
-#include <linux/fb.h>
#include <video/atmel_lcdc.h>
#include <asm/arch/board.h>
@@ -277,40 +275,7 @@ void __init at91_add_device_nand(struct at91_nand_data *data) {}
* -------------------------------------------------------------------- */
- * Prefer the GPIO code since the TWI controller isn't robust
- * (gets overruns and underruns under load) and can only issue
- * repeated STARTs in one scenario (the driver doesn't yet handle them).
-#if defined(CONFIG_I2C_GPIO) || defined(CONFIG_I2C_GPIO_MODULE)
-static struct i2c_gpio_platform_data pdata = {
- .sda_pin = AT91_PIN_PA7,
- .sda_is_open_drain = 1,
- .scl_pin = AT91_PIN_PA8,
- .scl_is_open_drain = 1,
- .udelay = 2, /* ~100 kHz */
-static struct platform_device at91sam9261_twi_device = {
- .name = "i2c-gpio",
- .dev.platform_data = &pdata,
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
- at91_set_GPIO_periph(AT91_PIN_PA7, 1); /* TWD (SDA) */
- at91_set_multi_drive(AT91_PIN_PA7, 1);
- at91_set_GPIO_periph(AT91_PIN_PA8, 1); /* TWCK (SCL) */
- at91_set_multi_drive(AT91_PIN_PA8, 1);
- i2c_register_board_info(0, devices, nr_devices);
- platform_device_register(&at91sam9261_twi_device);
-#elif defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
+#if defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
static struct resource twi_resources[] = {
@@ -332,7 +297,7 @@ static struct platform_device at91sam9261_twi_device = {
.num_resources = ARRAY_SIZE(twi_resources),
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
+void __init at91_add_device_i2c(void)
/* pins used for TWI interface */
at91_set_A_periph(AT91_PIN_PA7, 0); /* TWD */
@@ -341,11 +306,10 @@ void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
at91_set_A_periph(AT91_PIN_PA8, 0); /* TWCK */
at91_set_multi_drive(AT91_PIN_PA8, 1);
- i2c_register_board_info(0, devices, nr_devices);
platform_device_register(&at91sam9261_twi_device);
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices) {}
+void __init at91_add_device_i2c(void) {}
diff --git a/arch/arm/mach-at91/at91sam9263_devices.c b/arch/arm/mach-at91/at91sam9263_devices.c
index ac329a9..f924bd5 100644
--- a/arch/arm/mach-at91/at91sam9263_devices.c
+++ b/arch/arm/mach-at91/at91sam9263_devices.c
#include <asm/mach/map.h>
#include <linux/platform_device.h>
-#include <linux/i2c-gpio.h>
-#include <linux/fb.h>
#include <video/atmel_lcdc.h>
#include <asm/arch/board.h>
@@ -423,40 +421,7 @@ void __init at91_add_device_nand(struct at91_nand_data *data) {}
 * -------------------------------------------------------------------- */
- * Prefer the GPIO code since the TWI controller isn't robust
- * (gets overruns and underruns under load) and can only issue
- * repeated STARTs in one scenario (the driver doesn't yet handle them).
-#if defined(CONFIG_I2C_GPIO) || defined(CONFIG_I2C_GPIO_MODULE)
-static struct i2c_gpio_platform_data pdata = {
-	.sda_pin		= AT91_PIN_PB4,
-	.sda_is_open_drain	= 1,
-	.scl_pin		= AT91_PIN_PB5,
-	.scl_is_open_drain	= 1,
-	.udelay			= 2,		/* ~100 kHz */
-static struct platform_device at91sam9263_twi_device = {
-	.name			= "i2c-gpio",
-	.dev.platform_data	= &pdata,
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
-	at91_set_GPIO_periph(AT91_PIN_PB4, 1);		/* TWD (SDA) */
-	at91_set_multi_drive(AT91_PIN_PB4, 1);
-	at91_set_GPIO_periph(AT91_PIN_PB5, 1);		/* TWCK (SCL) */
-	at91_set_multi_drive(AT91_PIN_PB5, 1);
-	i2c_register_board_info(0, devices, nr_devices);
-	platform_device_register(&at91sam9263_twi_device);
-#elif defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
+#if defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
static struct resource twi_resources[] = {
@@ -478,7 +443,7 @@ static struct platform_device at91sam9263_twi_device = {
	.num_resources	= ARRAY_SIZE(twi_resources),
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
+void __init at91_add_device_i2c(void)
	/* pins used for TWI interface */
	at91_set_A_periph(AT91_PIN_PB4, 0);		/* TWD */
@@ -487,11 +452,10 @@ void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
	at91_set_A_periph(AT91_PIN_PB5, 0);		/* TWCK */
	at91_set_multi_drive(AT91_PIN_PB5, 1);
-	i2c_register_board_info(0, devices, nr_devices);
	platform_device_register(&at91sam9263_twi_device);
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices) {}
+void __init at91_add_device_i2c(void) {}
diff --git a/arch/arm/mach-at91/at91sam9rl_devices.c b/arch/arm/mach-at91/at91sam9rl_devices.c
index 2bd60a3..cd7532b 100644
--- a/arch/arm/mach-at91/at91sam9rl_devices.c
+++ b/arch/arm/mach-at91/at91sam9rl_devices.c
#include <asm/mach/map.h>
#include <linux/platform_device.h>
-#include <linux/i2c-gpio.h>
#include <linux/fb.h>
#include <video/atmel_lcdc.h>
#include <asm/arch/board.h>
@@ -170,40 +169,7 @@ void __init at91_add_device_nand(struct at91_nand_data *data) {}
 * -------------------------------------------------------------------- */
- * Prefer the GPIO code since the TWI controller isn't robust
- * (gets overruns and underruns under load) and can only issue
- * repeated STARTs in one scenario (the driver doesn't yet handle them).
-#if defined(CONFIG_I2C_GPIO) || defined(CONFIG_I2C_GPIO_MODULE)
-static struct i2c_gpio_platform_data pdata = {
-	.sda_pin		= AT91_PIN_PA23,
-	.sda_is_open_drain	= 1,
-	.scl_pin		= AT91_PIN_PA24,
-	.scl_is_open_drain	= 1,
-	.udelay			= 2,		/* ~100 kHz */
-static struct platform_device at91sam9rl_twi_device = {
-	.name			= "i2c-gpio",
-	.dev.platform_data	= &pdata,
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
-	at91_set_GPIO_periph(AT91_PIN_PA23, 1);		/* TWD (SDA) */
-	at91_set_multi_drive(AT91_PIN_PA23, 1);
-	at91_set_GPIO_periph(AT91_PIN_PA24, 1);		/* TWCK (SCL) */
-	at91_set_multi_drive(AT91_PIN_PA24, 1);
-	i2c_register_board_info(0, devices, nr_devices);
-	platform_device_register(&at91sam9rl_twi_device);
-#elif defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
+#if defined(CONFIG_I2C_AT91) || defined(CONFIG_I2C_AT91_MODULE)
static struct resource twi_resources[] = {
@@ -225,7 +191,7 @@ static struct platform_device at91sam9rl_twi_device = {
	.num_resources	= ARRAY_SIZE(twi_resources),
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
+void __init at91_add_device_i2c(void)
	/* pins used for TWI interface */
	at91_set_A_periph(AT91_PIN_PA23, 0);		/* TWD */
@@ -234,11 +200,10 @@ void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices)
	at91_set_A_periph(AT91_PIN_PA24, 0);		/* TWCK */
	at91_set_multi_drive(AT91_PIN_PA24, 1);
-	i2c_register_board_info(0, devices, nr_devices);
	platform_device_register(&at91sam9rl_twi_device);
-void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices) {}
+void __init at91_add_device_i2c(void) {}
diff --git a/arch/arm/mach-at91/board-carmeva.c b/arch/arm/mach-at91/board-carmeva.c
index 0f08782..76ec856 100644
--- a/arch/arm/mach-at91/board-carmeva.c
+++ b/arch/arm/mach-at91/board-carmeva.c
@@ -128,7 +128,7 @@ static void __init carmeva_board_init(void)
	at91_add_device_udc(&carmeva_udc_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	at91_add_device_spi(carmeva_spi_devices, ARRAY_SIZE(carmeva_spi_devices));
diff --git a/arch/arm/mach-at91/board-csb337.c b/arch/arm/mach-at91/board-csb337.c
index d0aa20c..dde0899 100644
--- a/arch/arm/mach-at91/board-csb337.c
+++ b/arch/arm/mach-at91/board-csb337.c
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/platform_device.h>
+#include <linux/i2c.h>
#include <linux/spi/spi.h>
#include <linux/mtd/physmap.h>
@@ -84,12 +85,12 @@ static struct at91_udc_data __initdata csb337_udc_data = {
static struct i2c_board_info __initdata csb337_i2c_devices[] = {
-	I2C_BOARD_INFO("rtc-ds1307", 0x68),
+	{ I2C_BOARD_INFO("rtc-ds1307", 0x68),
static struct at91_cf_data __initdata csb337_cf_data = {
	 * connector P4 on the CSB 337 mates to
@@ -167,7 +168,9 @@ static void __init csb337_board_init(void)
	at91_add_device_udc(&csb337_udc_data);
-	at91_add_device_i2c(csb337_i2c_devices, ARRAY_SIZE(csb337_i2c_devices));
+	at91_add_device_i2c();
+	i2c_register_board_info(0, csb337_i2c_devices,
+			ARRAY_SIZE(csb337_i2c_devices));
	at91_set_gpio_input(AT91_PIN_PB22, 1);		/* IOIS16 */
	at91_add_device_cf(&csb337_cf_data);
diff --git a/arch/arm/mach-at91/board-csb637.c b/arch/arm/mach-at91/board-csb637.c
index c5c721d..77f04b9 100644
--- a/arch/arm/mach-at91/board-csb637.c
+++ b/arch/arm/mach-at91/board-csb637.c
@@ -129,7 +129,7 @@ static void __init csb637_board_init(void)
	at91_add_device_udc(&csb637_udc_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	at91_add_device_spi(NULL, 0);
diff --git a/arch/arm/mach-at91/board-dk.c b/arch/arm/mach-at91/board-dk.c
index 40c9e43..af49789 100644
--- a/arch/arm/mach-at91/board-dk.c
+++ b/arch/arm/mach-at91/board-dk.c
@@ -124,19 +124,6 @@ static struct spi_board_info dk_spi_devices[] = {
-static struct i2c_board_info __initdata dk_i2c_devices[] = {
-	I2C_BOARD_INFO("ics1523", 0x26),
-	I2C_BOARD_INFO("x9429", 0x28),
-	I2C_BOARD_INFO("at24c", 0x50),
-	.type	= "24c1024",
static struct mtd_partition __initdata dk_nand_partition[] = {
	.name	= "NAND Partition 1",
@@ -198,7 +185,7 @@ static void __init dk_board_init(void)
	at91_add_device_cf(&dk_cf_data);
-	at91_add_device_i2c(dk_i2c_devices, ARRAY_SIZE(dk_i2c_devices));
+	at91_add_device_i2c();
	at91_add_device_spi(dk_spi_devices, ARRAY_SIZE(dk_spi_devices));
#ifdef CONFIG_MTD_AT91_DATAFLASH_CARD
diff --git a/arch/arm/mach-at91/board-eb9200.c b/arch/arm/mach-at91/board-eb9200.c
index b7b79bb..20458b5 100644
--- a/arch/arm/mach-at91/board-eb9200.c
+++ b/arch/arm/mach-at91/board-eb9200.c
@@ -91,14 +91,6 @@ static struct at91_mmc_data __initdata eb9200_mmc_data = {
-static struct i2c_board_info __initdata eb9200_i2c_devices[] = {
-	I2C_BOARD_INFO("at24c", 0x50),
static void __init eb9200_board_init(void)
@@ -110,7 +102,7 @@ static void __init eb9200_board_init(void)
	at91_add_device_udc(&eb9200_udc_data);
-	at91_add_device_i2c(eb9200_i2c_devices, ARRAY_SIZE(eb9200_i2c_devices));
+	at91_add_device_i2c();
	at91_add_device_cf(&eb9200_cf_data);
diff --git a/arch/arm/mach-at91/board-ek.c b/arch/arm/mach-at91/board-ek.c
index d05b1b2..322fdd7 100644
--- a/arch/arm/mach-at91/board-ek.c
+++ b/arch/arm/mach-at91/board-ek.c
@@ -145,7 +145,7 @@ static void __init ek_board_init(void)
	at91_add_device_udc(&ek_udc_data);
	at91_set_multi_drive(ek_udc_data.pullup_pin, 1);	/* pullup_pin is connected to reset */
-	at91_add_device_i2c(ek_i2c_devices, ARRAY_SIZE(ek_i2c_devices));
+	at91_add_device_i2c();
	at91_add_device_spi(ek_spi_devices, ARRAY_SIZE(ek_spi_devices));
#ifdef CONFIG_MTD_AT91_DATAFLASH_CARD
diff --git a/arch/arm/mach-at91/board-kafa.c b/arch/arm/mach-at91/board-kafa.c
index cf1b7b2..c77d84c 100644
--- a/arch/arm/mach-at91/board-kafa.c
+++ b/arch/arm/mach-at91/board-kafa.c
@@ -92,7 +92,7 @@ static void __init kafa_board_init(void)
	at91_add_device_udc(&kafa_udc_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	at91_add_device_spi(NULL, 0);
diff --git a/arch/arm/mach-at91/board-kb9202.c b/arch/arm/mach-at91/board-kb9202.c
index 4b39b9c..7d9b1a2 100644
--- a/arch/arm/mach-at91/board-kb9202.c
+++ b/arch/arm/mach-at91/board-kb9202.c
@@ -124,7 +124,7 @@ static void __init kb9202_board_init(void)
	at91_add_device_mmc(0, &kb9202_mmc_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	at91_add_device_spi(NULL, 0);
diff --git a/arch/arm/mach-at91/board-picotux200.c b/arch/arm/mach-at91/board-picotux200.c
index 6acb55c..49cfe7a 100644
--- a/arch/arm/mach-at91/board-picotux200.c
+++ b/arch/arm/mach-at91/board-picotux200.c
@@ -139,7 +139,7 @@ static void __init picotux200_board_init(void)
//	at91_add_device_udc(&picotux200_udc_data);
//	at91_set_multi_drive(picotux200_udc_data.pullup_pin, 1);	/* pullup_pin is connected to reset */
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
//	at91_add_device_spi(picotux200_spi_devices, ARRAY_SIZE(picotux200_spi_devices));
#ifdef CONFIG_MTD_AT91_DATAFLASH_CARD
diff --git a/arch/arm/mach-at91/board-sam9260ek.c b/arch/arm/mach-at91/board-sam9260ek.c
index b343a6c..65fa532 100644
--- a/arch/arm/mach-at91/board-sam9260ek.c
+++ b/arch/arm/mach-at91/board-sam9260ek.c
@@ -189,7 +189,7 @@ static void __init ek_board_init(void)
	at91_add_device_mmc(0, &ek_mmc_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
MACHINE_START(AT91SAM9260EK, "Atmel AT91SAM9260-EK")
diff --git a/arch/arm/mach-at91/board-sam9261ek.c b/arch/arm/mach-at91/board-sam9261ek.c
index 550ae59..42e172c 100644
--- a/arch/arm/mach-at91/board-sam9261ek.c
+++ b/arch/arm/mach-at91/board-sam9261ek.c
@@ -382,14 +382,14 @@ static struct platform_device ek_button_device = {
static void __init ek_add_device_buttons(void)
-	at91_set_gpio_input(AT91_PIN_PA27, 0);	/* btn0 */
-	at91_set_deglitch(AT91_PIN_PA27, 1);
-	at91_set_gpio_input(AT91_PIN_PA26, 0);	/* btn1 */
-	at91_set_deglitch(AT91_PIN_PA26, 1);
-	at91_set_gpio_input(AT91_PIN_PA25, 0);	/* btn2 */
-	at91_set_deglitch(AT91_PIN_PA25, 1);
-	at91_set_gpio_input(AT91_PIN_PA24, 0);	/* btn3 */
-	at91_set_deglitch(AT91_PIN_PA24, 1);
+	at91_set_gpio_input(AT91_PIN_PB27, 0);	/* btn0 */
+	at91_set_deglitch(AT91_PIN_PB27, 1);
+	at91_set_gpio_input(AT91_PIN_PB26, 0);	/* btn1 */
+	at91_set_deglitch(AT91_PIN_PB26, 1);
+	at91_set_gpio_input(AT91_PIN_PB25, 0);	/* btn2 */
+	at91_set_deglitch(AT91_PIN_PB25, 1);
+	at91_set_gpio_input(AT91_PIN_PB24, 0);	/* btn3 */
+	at91_set_deglitch(AT91_PIN_PB24, 1);
	platform_device_register(&ek_button_device);
@@ -406,7 +406,7 @@ static void __init ek_board_init(void)
	at91_add_device_udc(&ek_udc_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	at91_add_device_nand(&ek_nand_data);
	/* DM9000 ethernet */
diff --git a/arch/arm/mach-at91/board-sam9263ek.c b/arch/arm/mach-at91/board-sam9263ek.c
index ab9dcc0..2a1cc73 100644
--- a/arch/arm/mach-at91/board-sam9263ek.c
+++ b/arch/arm/mach-at91/board-sam9263ek.c
@@ -291,7 +291,7 @@ static void __init ek_board_init(void)
	at91_add_device_nand(&ek_nand_data);
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	/* LCD Controller */
	at91_add_device_lcdc(&ek_lcdc_data);
diff --git a/arch/arm/mach-at91/board-sam9rlek.c b/arch/arm/mach-at91/board-sam9rlek.c
index bc0546d..9b61320 100644
--- a/arch/arm/mach-at91/board-sam9rlek.c
+++ b/arch/arm/mach-at91/board-sam9rlek.c
@@ -181,7 +181,7 @@ static void __init ek_board_init(void)
	at91_add_device_serial();
-	at91_add_device_i2c(NULL, 0);
+	at91_add_device_i2c();
	at91_add_device_nand(&ek_nand_data);
diff --git a/arch/arm/mach-at91/clock.c b/arch/arm/mach-at91/clock.c
index 57c3b64..848efb2 100644
--- a/arch/arm/mach-at91/clock.c
+++ b/arch/arm/mach-at91/clock.c
@@ -351,7 +351,7 @@ static void init_programmable_clock(struct clk *clk)
	pckr = at91_sys_read(AT91_PMC_PCKR(clk->id));
	parent = at91_css_to_clk(pckr & AT91_PMC_CSS);
	clk->parent = parent;
-	clk->rate_hz = parent->rate_hz / (1 << ((pckr & AT91_PMC_PRES) >> 2));
+	clk->rate_hz = parent->rate_hz / (1 << ((pckr >> 2) & 3));
#endif	/* CONFIG_AT91_PROGRAMMABLE_CLOCKS */
@@ -587,11 +587,8 @@ int __init at91_clock_init(unsigned long main_clock)
	mckr = at91_sys_read(AT91_PMC_MCKR);
	mck.parent = at91_css_to_clk(mckr & AT91_PMC_CSS);
	freq = mck.parent->rate_hz;
-	freq /= (1 << ((mckr & AT91_PMC_PRES) >> 2));		/* prescale */
-	if (cpu_is_at91rm9200())
-		mck.rate_hz = freq / (1 + ((mckr & AT91_PMC_MDIV) >> 8));	/* mdiv */
-		mck.rate_hz = freq / (1 << ((mckr & AT91_PMC_MDIV) >> 8));	/* mdiv */
+	freq /= (1 << ((mckr >> 2) & 3));		/* prescale */
+	mck.rate_hz = freq / (1 + ((mckr >> 8) & 3));	/* mdiv */
	/* Register the PMC's standard clocks */
	for (i = 0; i < ARRAY_SIZE(standard_pmc_clocks); i++)
diff --git a/arch/arm/mach-imx/irq.c b/arch/arm/mach-imx/irq.c
index a7465db..0791b56 100644
--- a/arch/arm/mach-imx/irq.c
+++ b/arch/arm/mach-imx/irq.c
-#define INTCNTL_OFF               0x00
-#define NIMASK_OFF                0x04
-#define INTENNUM_OFF              0x08
-#define INTDISNUM_OFF             0x0C
-#define INTENABLEH_OFF            0x10
-#define INTENABLEL_OFF            0x14
-#define INTTYPEH_OFF              0x18
-#define INTTYPEL_OFF              0x1C
-#define NIPRIORITY_OFF(x)         (0x20+4*(7-(x)))
-#define NIVECSR_OFF               0x40
-#define FIVECSR_OFF               0x44
-#define INTSRCH_OFF               0x48
-#define INTSRCL_OFF               0x4C
-#define INTFRCH_OFF               0x50
-#define INTFRCL_OFF               0x54
-#define NIPNDH_OFF                0x58
-#define NIPNDL_OFF                0x5C
-#define FIPNDH_OFF                0x60
-#define FIPNDL_OFF                0x64
+#define INTENNUM_OFF              0x8
+#define INTDISNUM_OFF             0xC
#define VA_AITC_BASE              IO_ADDRESS(IMX_AITC_BASE)
-#define IMX_AITC_INTCNTL          (VA_AITC_BASE + INTCNTL_OFF)
-#define IMX_AITC_NIMASK           (VA_AITC_BASE + NIMASK_OFF)
-#define IMX_AITC_INTENNUM         (VA_AITC_BASE + INTENNUM_OFF)
#define IMX_AITC_INTDISNUM        (VA_AITC_BASE + INTDISNUM_OFF)
-#define IMX_AITC_INTENABLEH       (VA_AITC_BASE + INTENABLEH_OFF)
-#define IMX_AITC_INTENABLEL       (VA_AITC_BASE + INTENABLEL_OFF)
-#define IMX_AITC_INTTYPEH         (VA_AITC_BASE + INTTYPEH_OFF)
-#define IMX_AITC_INTTYPEL         (VA_AITC_BASE + INTTYPEL_OFF)
-#define IMX_AITC_NIPRIORITY(x)    (VA_AITC_BASE + NIPRIORITY_OFF(x))
-#define IMX_AITC_NIVECSR          (VA_AITC_BASE + NIVECSR_OFF)
-#define IMX_AITC_FIVECSR          (VA_AITC_BASE + FIVECSR_OFF)
-#define IMX_AITC_INTSRCH          (VA_AITC_BASE + INTSRCH_OFF)
-#define IMX_AITC_INTSRCL          (VA_AITC_BASE + INTSRCL_OFF)
-#define IMX_AITC_INTFRCH          (VA_AITC_BASE + INTFRCH_OFF)
-#define IMX_AITC_INTFRCL          (VA_AITC_BASE + INTFRCL_OFF)
-#define IMX_AITC_NIPNDH           (VA_AITC_BASE + NIPNDH_OFF)
-#define IMX_AITC_NIPNDL           (VA_AITC_BASE + NIPNDL_OFF)
-#define IMX_AITC_FIPNDH           (VA_AITC_BASE + FIPNDH_OFF)
-#define IMX_AITC_FIPNDL           (VA_AITC_BASE + FIPNDL_OFF)
+#define IMX_AITC_INTENNUM         (VA_AITC_BASE + INTENNUM_OFF)
#define DEBUG_IRQ(fmt...)	printk(fmt)
@@ -256,12 +222,7 @@ imx_init_irq(void)
	DEBUG_IRQ("Initializing imx interrupts\n");
-	/* Disable all interrupts initially. */
-	/* Do not rely on the bootloader. */
-	__raw_writel(0, IMX_AITC_INTENABLEH);
-	__raw_writel(0, IMX_AITC_INTENABLEL);
-	/* Mask all GPIO interrupts as well */
+	/* Mask all interrupts initially */
@@ -284,6 +245,6 @@ imx_init_irq(void)
	set_irq_chained_handler(GPIO_INT_PORTC, imx_gpioc_demux_handler);
	set_irq_chained_handler(GPIO_INT_PORTD, imx_gpiod_demux_handler);
-	/* Release masking of interrupts according to priority */
-	__raw_writel(-1, IMX_AITC_NIMASK);
+	/* Disable all interrupts initially. */
+	/* In IMX this is done in the bootloader. */
diff --git a/arch/arm/mach-pxa/pxa27x.c b/arch/arm/mach-pxa/pxa27x.c
index 8e126e6..d0f2b59 100644
--- a/arch/arm/mach-pxa/pxa27x.c
+++ b/arch/arm/mach-pxa/pxa27x.c
@@ -146,7 +146,7 @@ static struct clk pxa27x_clks[] = {
	INIT_CKEN("MMCCLK",  MMC,      19500000, 0, &pxa_device_mci.dev),
	INIT_CKEN("FICPCLK", FICP,     48000000, 0, &pxa_device_ficp.dev),
-	INIT_CKEN("USBCLK",  USBHOST,  48000000, 0, &pxa27x_device_ohci.dev),
+	INIT_CKEN("USBCLK",  USB,      48000000, 0, &pxa27x_device_ohci.dev),
	INIT_CKEN("I2CCLK",  PWRI2C,   13000000, 0, &pxa27x_device_i2c_power.dev),
	INIT_CKEN("KBDCLK",  KEYPAD,   32768,    0, NULL),
diff --git a/arch/arm/mach-pxa/pxa320.c b/arch/arm/mach-pxa/pxa320.c
index 74128eb..1010f77 100644
--- a/arch/arm/mach-pxa/pxa320.c
+++ b/arch/arm/mach-pxa/pxa320.c
static struct pxa3xx_mfp_addr_map pxa320_mfp_addr_map[] __initdata = {
	MFP_ADDR_X(GPIO0, GPIO4, 0x0124),
-	MFP_ADDR_X(GPIO5, GPIO9, 0x028C),
-	MFP_ADDR(GPIO10, 0x0458),
-	MFP_ADDR_X(GPIO11, GPIO26, 0x02A0),
-	MFP_ADDR_X(GPIO27, GPIO48, 0x0400),
-	MFP_ADDR_X(GPIO49, GPIO62, 0x045C),
+	MFP_ADDR_X(GPIO5, GPIO26, 0x028C),
+	MFP_ADDR_X(GPIO27, GPIO62, 0x0400),
	MFP_ADDR_X(GPIO63, GPIO73, 0x04B4),
	MFP_ADDR_X(GPIO74, GPIO98, 0x04F0),
	MFP_ADDR_X(GPIO99, GPIO127, 0x0600),
diff --git a/arch/arm/mach-pxa/ssp.c b/arch/arm/mach-pxa/ssp.c
index 422afee..71766ac 100644
--- a/arch/arm/mach-pxa/ssp.c
+++ b/arch/arm/mach-pxa/ssp.c
@@ -309,7 +309,6 @@ void ssp_exit(struct ssp_dev *dev)
	if (dev->port > PXA_SSP_PORTS || dev->port == 0) {
		printk(KERN_WARNING "SSP: tried to close invalid port\n");
-		mutex_unlock(&mutex);
diff --git a/arch/cris/arch-v10/drivers/Kconfig b/arch/cris/arch-v10/drivers/Kconfig
index e3c0f29..faf8b4d 100644
--- a/arch/cris/arch-v10/drivers/Kconfig
+++ b/arch/cris/arch-v10/drivers/Kconfig
@@ -542,6 +542,45 @@ config ETRAX_RS485_DISABLE_RECEIVER
	  loopback. Not all products are able to do this in software only.
	  Axis 2400/2401 must disable receiver.
+	bool "ATA/IDE support"
+	select BLK_DEV_IDE
+	select BLK_DEV_IDEDISK
+	select BLK_DEV_IDECD
+	select BLK_DEV_IDEDMA
+	select IDE_GENERIC
+	  Enable this to get support for ATA/IDE.
+	  You can't use parallel ports or SCSI ports
+config ETRAX_IDE_DELAY
+	int "Delay for drives to regain consciousness"
+	depends on ETRAX_IDE
+	  Number of seconds to wait for IDE drives to spin up after an IDE
+	prompt "IDE reset pin"
+	depends on ETRAX_IDE
+	default ETRAX_IDE_PB7_RESET
+config ETRAX_IDE_PB7_RESET
+	bool "Port_PB_Bit_7"
+	  IDE reset on pin 7 on port B
+config ETRAX_IDE_G27_RESET
+	bool "Port_G_Bit_27"
+	  IDE reset on pin 27 on port G
config ETRAX_USB_HOST
diff --git a/arch/cris/arch-v32/drivers/Kconfig b/arch/cris/arch-v32/drivers/Kconfig
index 9bccb5e..7f72d7c 100644
--- a/arch/cris/arch-v32/drivers/Kconfig
+++ b/arch/cris/arch-v32/drivers/Kconfig
@@ -582,6 +582,18 @@ config ETRAX_PE_CHANGEABLE_BITS
	  that a user can change the value on using ioctl's.
	  Bit set = changeable.
+	bool "ATA/IDE support"
+	depends on ETRAX_ARCH_V32
+	select BLK_DEV_IDE
+	select BLK_DEV_IDEDISK
+	select BLK_DEV_IDECD
+	select BLK_DEV_IDEDMA
+	select IDE_GENERIC
+	  Enables the ETRAX IDE driver.
config ETRAX_CARDBUS
	bool "Cardbus support"
	depends on ETRAX_ARCH_V32
diff --git a/arch/frv/kernel/break.S b/arch/frv/kernel/break.S
index bd0bdf9..dac4a5f 100644
--- a/arch/frv/kernel/break.S
+++ b/arch/frv/kernel/break.S
@@ -63,7 +63,7 @@ __break_trace_through_exceptions:
# entry point for Break Exceptions/Interrupts
###############################################################################
-	.section	.text.break
	.globl		__entry_break
diff --git a/arch/frv/kernel/entry.S b/arch/frv/kernel/entry.S
index f926c70..1e74f3c 100644
--- a/arch/frv/kernel/entry.S
+++ b/arch/frv/kernel/entry.S
#define nr_syscalls ((syscall_table_size)/4)
-	.section	.text.entry
diff --git a/arch/frv/kernel/vmlinux.lds.S b/arch/frv/kernel/vmlinux.lds.S
index a17a81d..3b71e0c 100644
--- a/arch/frv/kernel/vmlinux.lds.S
+++ b/arch/frv/kernel/vmlinux.lds.S
@@ -76,12 +76,6 @@ SECTIONS
-  .data.page_aligned : { *(.data.idt) }
-  . = ALIGN(L1_CACHE_BYTES);
-  .data.cacheline_aligned : { *(.data.cacheline_aligned) }
  /* trap table management - read entry-table.S before modifying */
@@ -92,25 +86,28 @@ SECTIONS
+  .data.page_aligned : { *(.data.idt) }
+  . = ALIGN(L1_CACHE_BYTES);
+  .data.cacheline_aligned : { *(.data.cacheline_aligned) }
  /* Text and read-only data */
-#ifdef CONFIG_DEBUG_INFO
+	.text.start .text.*
+#ifdef CONFIG_DEBUG_INFO
diff --git a/arch/frv/mm/tlb-miss.S b/arch/frv/mm/tlb-miss.S
index 0764348..04da674 100644
--- a/arch/frv/mm/tlb-miss.S
+++ b/arch/frv/mm/tlb-miss.S
#include <asm/highmem.h>
#include <asm/spr-regs.h>
-	.section	.text.tlbmiss
	.globl		__entry_insn_mmu_miss
diff --git a/arch/m32r/kernel/signal.c b/arch/m32r/kernel/signal.c
index 1812454..a753d79 100644
--- a/arch/m32r/kernel/signal.c
+++ b/arch/m32r/kernel/signal.c
@@ -36,7 +36,7 @@ sys_rt_sigsuspend(sigset_t __user *unewset, size_t sigsetsize,
		  unsigned long r2, unsigned long r3, unsigned long r4,
		  unsigned long r5, unsigned long r6, struct pt_regs *regs)
+	sigset_t saveset, newset;
	/* XXX: Don't preclude handling different sized sigset_t's. */
	if (sigsetsize != sizeof(sigset_t))
@@ -44,18 +44,21 @@ sys_rt_sigsuspend(sigset_t __user *unewset, size_t sigsetsize,
	if (copy_from_user(&newset, unewset, sizeof(newset)))
-	sigdelsetmask(&newset, sigmask(SIGKILL)|sigmask(SIGSTOP));
+	sigdelsetmask(&newset, ~_BLOCKABLE);
	spin_lock_irq(&current->sighand->siglock);
-	current->saved_sigmask = current->blocked;
+	saveset = current->blocked;
	current->blocked = newset;
	recalc_sigpending();
	spin_unlock_irq(&current->sighand->siglock);
-	current->state = TASK_INTERRUPTIBLE;
-	set_thread_flag(TIF_RESTORE_SIGMASK);
-	return -ERESTARTNOHAND;
+	regs->r0 = -EINTR;
+	current->state = TASK_INTERRUPTIBLE;
+	if (do_signal(regs, &saveset))
diff --git a/arch/m32r/kernel/syscall_table.S b/arch/m32r/kernel/syscall_table.S
index 95aa798..751ac2a 100644
--- a/arch/m32r/kernel/syscall_table.S
+++ b/arch/m32r/kernel/syscall_table.S
@@ -284,43 +284,3 @@ ENTRY(sys_call_table)
	.long sys_mq_getsetattr
	.long sys_ni_syscall		/* reserved for kexec */
-	.long sys_ni_syscall		/* 285 */ /* available */
-	.long sys_request_key
-	.long sys_ioprio_set
-	.long sys_ioprio_get		/* 290 */
-	.long sys_inotify_init
-	.long sys_inotify_add_watch
-	.long sys_inotify_rm_watch
-	.long sys_migrate_pages
-	.long sys_openat		/* 295 */
-	.long sys_fchownat
-	.long sys_futimesat
-	.long sys_fstatat64		/* 300 */
-	.long sys_unlinkat
-	.long sys_renameat
-	.long sys_symlinkat
-	.long sys_readlinkat		/* 305 */
-	.long sys_fchmodat
-	.long sys_faccessat
-	.long sys_pselect6
-	.long sys_unshare		/* 310 */
-	.long sys_set_robust_list
-	.long sys_get_robust_list
-	.long sys_sync_file_range
-	.long sys_tee			/* 315 */
-	.long sys_vmsplice
-	.long sys_move_pages
-	.long sys_epoll_pwait
-	.long sys_utimensat		/* 320 */
-	.long sys_signalfd
-	.long sys_fallocate
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index 0e2b5cd..74c5c62 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
 * Copyright (C) 2007 by Ralf Baechle
-#include <linux/clocksource.h>
-#include <linux/init.h>
-#include <asm/time.h>
static cycle_t c0_hpt_read(void)
@@ -22,7 +18,7 @@ static struct clocksource clocksource_mips = {
	.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
-void __init init_mips_clocksource(void)
+static void __init init_mips_clocksource(void)
	/* Calclate a somewhat reasonable rating value */
	clocksource_mips.rating = 200 + mips_hpt_frequency / 10000000;
diff --git a/arch/mips/sgi-ip22/ip22-setup.c b/arch/mips/sgi-ip22/ip22-setup.c
index 5f389ee..174f09e 100644
--- a/arch/mips/sgi-ip22/ip22-setup.c
+++ b/arch/mips/sgi-ip22/ip22-setup.c
unsigned long sgi_gfxaddr;
EXPORT_SYMBOL_GPL(sgi_gfxaddr);
+ * Stop-A is originally a Sun thing that isn't standard on IP22 so to avoid
+ * accidents it's disabled by default on IP22.
+ * FIXME: provide a mechanism to change the value of stop_a_enabled.
+int stop_a_enabled;
+void ip22_do_break(void)
+	if (!stop_a_enabled)
+	ArcEnterInteractiveMode();
+EXPORT_SYMBOL(ip22_do_break);
extern void ip22_be_init(void) __init;
void __init plat_mem_setup(void)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 232c298..18f397c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -187,11 +187,6 @@ config FORCE_MAX_ZONEORDER
	default "9" if PPC_64K_PAGES
-config HUGETLB_PAGE_SIZE_VARIABLE
-	depends on HUGETLB_PAGE
config MATH_EMULATION
	bool "Math emulation"
	depends on 4xx || 8xx || E200 || PPC_MPC832x || E500
diff --git a/arch/um/Makefile b/arch/um/Makefile
index ba6813a..31999bc 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -168,7 +168,7 @@ ifneq ($(KBUILD_SRC),)
	$(Q)mkdir -p $(objtree)/include/asm-um
	$(Q)ln -fsn $(srctree)/include/asm-$(HEADER_ARCH) include/asm-um/arch
-	$(Q)cd $(TOPDIR)/include/asm-um && ln -fsn ../asm-$(HEADER_ARCH) arch
+	$(Q)cd $(TOPDIR)/include/asm-um && ln -fsn ../asm-$(SUBARCH) arch
$(objtree)/$(ARCH_DIR)/include:
diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index b1a77b1..7e6cdde 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -1128,7 +1128,6 @@ static void do_ubd_request(struct request_queue *q)
			       "errno = %d\n", -n);
			else if(list_empty(&dev->restart))
				list_add(&dev->restart, &restart);
diff --git a/arch/um/os-Linux/time.c b/arch/um/os-Linux/time.c
index ef02d94..e34e1ef 100644
--- a/arch/um/os-Linux/time.c
+++ b/arch/um/os-Linux/time.c
@@ -59,7 +59,7 @@ long long disable_timer(void)
	struct itimerval time = ((struct itimerval) { { 0, 0 }, { 0, 0 } });
-	if (setitimer(ITIMER_VIRTUAL, &time, &time) < 0)
+	if(setitimer(ITIMER_VIRTUAL, &time, &time) < 0)
		printk(UM_KERN_ERR "disable_timer - setitimer failed, "
		       "errno = %d\n", errno);
@@ -74,61 +74,13 @@ long long os_nsecs(void)
	return timeval_to_ns(&tv);
-#ifdef UML_CONFIG_NO_HZ
-static int after_sleep_interval(struct timespec *ts)
-static inline long long timespec_to_us(const struct timespec *ts)
-	return ((long long) ts->tv_sec * UM_USEC_PER_SEC) +
-		ts->tv_nsec / UM_NSEC_PER_USEC;
-static int after_sleep_interval(struct timespec *ts)
-	int usec = UM_USEC_PER_SEC / UM_HZ;
-	long long start_usecs = timespec_to_us(ts);
-	struct timeval tv;
-	struct itimerval interval;
-	 * It seems that rounding can increase the value returned from
-	 * setitimer to larger than the one passed in.  Over time,
-	 * this will cause the remaining time to be greater than the
-	 * tick interval.  If this happens, then just reduce the first
-	 * tick to the interval value.
-	if (start_usecs > usec)
-		start_usecs = usec;
-	tv = ((struct timeval) { .tv_sec  = start_usecs / UM_USEC_PER_SEC,
-				 .tv_usec = start_usecs % UM_USEC_PER_SEC });
-	interval = ((struct itimerval) { { 0, usec }, tv });
-	if (setitimer(ITIMER_VIRTUAL, &interval, NULL) == -1)
extern void alarm_handler(int sig, struct sigcontext *sc);
void idle_sleep(unsigned long long nsecs)
-	struct timespec ts;
-	 * nsecs can come in as zero, in which case, this starts a
-	 * busy loop.  To prevent this, reset nsecs to the tick
-	 * interval if it is zero.
-		nsecs = UM_NSEC_PER_SEC / UM_HZ;
-	ts = ((struct timespec) { .tv_sec	= nsecs / UM_NSEC_PER_SEC,
-				  .tv_nsec	= nsecs % UM_NSEC_PER_SEC });
+	struct timespec ts = { .tv_sec	= nsecs / UM_NSEC_PER_SEC,
+			       .tv_nsec = nsecs % UM_NSEC_PER_SEC };
	if (nanosleep(&ts, &ts) == 0)
		alarm_handler(SIGVTALRM, NULL);
-	after_sleep_interval(&ts);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
2109
index 368864d..eded44e 100644
2110
--- a/arch/x86/Kconfig
2111
+++ b/arch/x86/Kconfig
2112
@@ -112,6 +112,9 @@ config GENERIC_TIME_VSYSCALL
2116
+config ARCH_SUPPORTS_KVM
2122
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
2123
index 4cc5b04..6ef5a06 100644
2124
--- a/arch/x86/boot/header.S
2125
+++ b/arch/x86/boot/header.S
2126
@@ -236,30 +236,39 @@ start_of_setup:
2130
-# Apparently some ancient versions of LILO invoked the kernel with %ss != %ds,
2131
-# which happened to work by accident for the old code. Recalculate the stack
2132
-# pointer if %ss is invalid. Otherwise leave it alone, LOADLIN sets up the
2133
-# stack behind its own code, so we can't blindly put it directly past the heap.
2134
+# Apparently some ancient versions of LILO invoked the kernel
2135
+# with %ss != %ds, which happened to work by accident for the
2136
+# old code. If the CAN_USE_HEAP flag is set in loadflags, or
2137
+# %ss != %ds, then adjust the stack pointer.
2139
+ # Smallest possible stack we can tolerate
2140
+ movw $(_end+STACK_SIZE), %cx
2142
+ movw heap_end_ptr, %dx
2145
+ xorw %dx, %dx # Wraparound - whole segment available
2146
+1: testb $CAN_USE_HEAP, loadflags
2151
cmpw %ax, %dx # %ds == %ss?
2153
- je 2f # -> assume %sp is reasonably set
2155
- # Invalid %ss, make up a new stack
2157
- testb $CAN_USE_HEAP, loadflags
2159
- movw heap_end_ptr, %dx
2160
-1: addw $STACK_SIZE, %dx
2162
- xorw %dx, %dx # Prevent wraparound
2163
+ # If so, assume %sp is reasonably set, otherwise use
2164
+ # the smallest possible stack.
2165
+ jne 4f # -> Smallest possible stack...
2167
-2: # Now %dx should point to the end of our stack space
2168
+ # Make sure the stack is at least minimum size. Take a value
2169
+ # of zero to mean "full segment."
2171
andw $~3, %dx # dword align (might as well...)
2173
movw $0xfffc, %dx # Make sure we're not zero
2177
+4: movw %cx, %dx # Minimum value we can possibly use
2179
movzwl %dx, %esp # Clear upper half of %esp
2180
sti # Now we should have a working stack
2182
diff --git a/arch/x86/kernel/paravirt_32.c b/arch/x86/kernel/paravirt_32.c
index f500079..6a80d67 100644
--- a/arch/x86/kernel/paravirt_32.c
+++ b/arch/x86/kernel/paravirt_32.c
@@ -465,8 +465,8 @@ struct pv_mmu_ops pv_mmu_ops = {
EXPORT_SYMBOL_GPL(pv_time_ops);
-EXPORT_SYMBOL    (pv_cpu_ops);
-EXPORT_SYMBOL    (pv_mmu_ops);
+EXPORT_SYMBOL_GPL(pv_cpu_ops);
+EXPORT_SYMBOL_GPL(pv_mmu_ops);
EXPORT_SYMBOL_GPL(pv_apic_ops);
EXPORT_SYMBOL_GPL(pv_info);
EXPORT_SYMBOL    (pv_irq_ops);
diff --git a/arch/x86/lguest/Kconfig b/arch/x86/lguest/Kconfig
index 19626ac..c4dffbe 100644
--- a/arch/x86/lguest/Kconfig
+++ b/arch/x86/lguest/Kconfig
@@ -2,7 +2,6 @@ config LGUEST_GUEST
bool "Lguest guest support"
-	depends on !(X86_VISWS || X86_VOYAGER)
select VIRTIO_CONSOLE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0f9c8c8..a7308b2 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -345,7 +345,7 @@ static void __init find_early_table_space(unsigned long end)
/* Setup the direct mapping of the physical memory at PAGE_OFFSET.
This runs before bootmem is initialized and gets pages directly from the
physical memory. To access them they are temporarily mapped. */
-void __init_refok init_memory_mapping(unsigned long start, unsigned long end)
+void __meminit init_memory_mapping(unsigned long start, unsigned long end)
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 0ac6c5d..b2e32f9 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -244,8 +244,6 @@ pte_t xen_make_pte(unsigned long long pte)
pte = phys_to_machine(XPADDR(pte)).maddr;
-	pte &= ~_PAGE_PCD;
return (pte_t){ pte, pte >> 32 };
@@ -293,8 +291,6 @@ pte_t xen_make_pte(unsigned long pte)
if (pte & _PAGE_PRESENT)
pte = phys_to_machine(XPADDR(pte)).maddr;
-	pte &= ~_PAGE_PCD;
return (pte_t){ pte };
diff --git a/block/blktrace.c b/block/blktrace.c
index 498a0a5..d00ac39 100644
--- a/block/blktrace.c
+++ b/block/blktrace.c
@@ -202,7 +202,6 @@ static void blk_remove_tree(struct dentry *dir)
static struct dentry *blk_create_tree(const char *blk_name)
struct dentry *dir = NULL;
mutex_lock(&blk_tree_mutex);
@@ -210,17 +209,13 @@ static struct dentry *blk_create_tree(const char *blk_name)
blk_tree_root = debugfs_create_dir("block", NULL);
dir = debugfs_create_dir(blk_name, blk_tree_root);
-		/* Delete root only if we created it */
-			blk_remove_root();
+		blk_remove_root();
mutex_unlock(&blk_tree_mutex);
diff --git a/block/genhd.c b/block/genhd.c
index f2ac914..e609996 100644
@@ -715,7 +715,6 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
disk->part = kmalloc_node(size,
GFP_KERNEL | __GFP_ZERO, node_id);
-				free_disk_stats(disk);
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 8b91994..3b927be 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -4080,7 +4080,23 @@ static ssize_t queue_max_hw_sectors_show(struct request_queue *q, char *page)
return queue_var_show(max_hw_sectors_kb, (page));
+static ssize_t queue_max_segments_show(struct request_queue *q, char *page)
+	return queue_var_show(q->max_phys_segments, page);
+static ssize_t queue_max_segments_store(struct request_queue *q,
+					const char *page, size_t count)
+	unsigned long segments;
+	ssize_t ret = queue_var_store(&segments, page, count);
+	spin_lock_irq(q->queue_lock);
+	q->max_phys_segments = segments;
+	spin_unlock_irq(q->queue_lock);
static struct queue_sysfs_entry queue_requests_entry = {
.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
.show = queue_requests_show,
@@ -4104,6 +4120,12 @@ static struct queue_sysfs_entry queue_max_hw_sectors_entry = {
.show = queue_max_hw_sectors_show,
+static struct queue_sysfs_entry queue_max_segments_entry = {
+	.attr = {.name = "max_segments", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_max_segments_show,
+	.store = queue_max_segments_store,
static struct queue_sysfs_entry queue_iosched_entry = {
.attr = {.name = "scheduler", .mode = S_IRUGO | S_IWUSR },
.show = elv_iosched_show,
@@ -4115,6 +4137,7 @@ static struct attribute *default_attrs[] = {
&queue_ra_entry.attr,
&queue_max_hw_sectors_entry.attr,
&queue_max_sectors_entry.attr,
+	&queue_max_segments_entry.attr,
&queue_iosched_entry.attr,
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index e48ee4f..015689d 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -494,7 +494,7 @@ static int get_cpu_id(acpi_handle handle, u32 acpi_id)
-	for_each_possible_cpu(i) {
+	for (i = 0; i < NR_CPUS; ++i) {
if (cpu_physical_id(i) == apic_id)
@@ -632,7 +632,7 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
-	BUG_ON((pr->id >= nr_cpu_ids) || (pr->id < 0));
+	BUG_ON((pr->id >= NR_CPUS) || (pr->id < 0));
@@ -774,7 +774,7 @@ static int acpi_processor_remove(struct acpi_device *device, int type)
pr = acpi_driver_data(device);
-	if (pr->id >= nr_cpu_ids) {
+	if (pr->id >= NR_CPUS) {
@@ -845,7 +845,7 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
-	if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
+	if ((pr->id >= 0) && (pr->id < NR_CPUS)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
@@ -883,13 +883,13 @@ acpi_processor_hotplug_notify(acpi_handle handle, u32 event, void *data)
-		if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
+		if (pr->id >= 0 && (pr->id < NR_CPUS)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
result = acpi_processor_start(device);
-		if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
+		if ((!result) && ((pr->id >= 0) && (pr->id < NR_CPUS))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
printk(KERN_ERR PREFIX "Device [%s] failed to start\n",
@@ -912,7 +912,7 @@ acpi_processor_hotplug_notify(acpi_handle handle, u32 event, void *data)
-		if ((pr->id < nr_cpu_ids) && (cpu_present(pr->id)))
+		if ((pr->id < NR_CPUS) && (cpu_present(pr->id)))
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2683eac..3f4d6aa 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -770,10 +770,9 @@ int device_add(struct device *dev)
error = device_add_attrs(dev);
-	error = dpm_sysfs_add(dev);
+	error = device_pm_add(dev);
-	device_pm_add(dev);
error = bus_add_device(dev);
@@ -798,7 +797,6 @@ int device_add(struct device *dev)
device_pm_remove(dev);
-	dpm_sysfs_remove(dev);
blocking_notifier_call_chain(&dev->bus->bus_notifier,
diff --git a/drivers/base/power/Makefile b/drivers/base/power/Makefile
index 44504e6..a803733 100644
--- a/drivers/base/power/Makefile
+++ b/drivers/base/power/Makefile
-obj-$(CONFIG_PM)	+= sysfs.o
-obj-$(CONFIG_PM_SLEEP)	+= main.o
+obj-$(CONFIG_PM_SLEEP)	+= main.o sysfs.o
obj-$(CONFIG_PM_TRACE)	+= trace.o
ifeq ($(CONFIG_DEBUG_DRIVER),y)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 691ffb6..0ab4ab2 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -38,14 +38,20 @@ static DEFINE_MUTEX(dpm_list_mtx);
int (*platform_enable_wakeup)(struct device *dev, int is_on);
-void device_pm_add(struct device *dev)
+int device_pm_add(struct device *dev)
pr_debug("PM: Adding info for %s:%s\n",
dev->bus ? dev->bus->name : "No Bus",
kobject_name(&dev->kobj));
mutex_lock(&dpm_list_mtx);
list_add_tail(&dev->power.entry, &dpm_active);
+	error = dpm_sysfs_add(dev);
+		list_del(&dev->power.entry);
mutex_unlock(&dpm_list_mtx);
void device_pm_remove(struct device *dev)
diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h
index 379da4e..5c4efd4 100644
--- a/drivers/base/power/power.h
+++ b/drivers/base/power/power.h
@@ -13,29 +13,14 @@ extern void device_shutdown(void);
extern struct list_head dpm_active;	/* The active device list */
-static inline struct device *to_device(struct list_head *entry)
+static inline struct device * to_device(struct list_head * entry)
return container_of(entry, struct device, power.entry);
-extern void device_pm_add(struct device *);
+extern int device_pm_add(struct device *);
extern void device_pm_remove(struct device *);
-#else /* CONFIG_PM_SLEEP */
-static inline void device_pm_add(struct device *dev)
-static inline void device_pm_remove(struct device *dev)
@@ -43,15 +28,16 @@ static inline void device_pm_remove(struct device *dev)
extern int dpm_sysfs_add(struct device *);
extern void dpm_sysfs_remove(struct device *);
-#else /* CONFIG_PM */
+#else /* CONFIG_PM_SLEEP */
-static inline int dpm_sysfs_add(struct device *dev)
+static inline int device_pm_add(struct device * dev)
-static inline void dpm_sysfs_remove(struct device *dev)
+static inline void device_pm_remove(struct device * dev)
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index a509b8d..bf18d75 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -457,7 +457,7 @@ config LEGACY_PTYS
config LEGACY_PTY_COUNT
int "Maximum number of legacy PTY in use"
depends on LEGACY_PTYS
The maximum number of legacy PTYs that can be used at any one time.
diff --git a/drivers/char/sonypi.c b/drivers/char/sonypi.c
index 921c6d2..877e53d 100644
--- a/drivers/char/sonypi.c
+++ b/drivers/char/sonypi.c
@@ -1163,7 +1163,7 @@ static struct acpi_driver sonypi_acpi_driver = {
-static int __devinit sonypi_create_input_devices(struct platform_device *pdev)
+static int __devinit sonypi_create_input_devices(void)
struct input_dev *jog_dev;
struct input_dev *key_dev;
@@ -1177,7 +1177,6 @@ static int __devinit sonypi_create_input_devices(struct platform_device *pdev)
jog_dev->name = "Sony Vaio Jogdial";
jog_dev->id.bustype = BUS_ISA;
jog_dev->id.vendor = PCI_VENDOR_ID_SONY;
-	jog_dev->dev.parent = &pdev->dev;
jog_dev->evbit[0] = BIT_MASK(EV_KEY) | BIT_MASK(EV_REL);
jog_dev->keybit[BIT_WORD(BTN_MOUSE)] = BIT_MASK(BTN_MIDDLE);
@@ -1192,7 +1191,6 @@ static int __devinit sonypi_create_input_devices(struct platform_device *pdev)
key_dev->name = "Sony Vaio Keys";
key_dev->id.bustype = BUS_ISA;
key_dev->id.vendor = PCI_VENDOR_ID_SONY;
-	key_dev->dev.parent = &pdev->dev;
/* Initialize the Input Drivers: special keys */
key_dev->evbit[0] = BIT_MASK(EV_KEY);
@@ -1387,7 +1385,7 @@ static int __devinit sonypi_probe(struct platform_device *dev)
-	error = sonypi_create_input_devices(dev);
+	error = sonypi_create_input_devices();
"sonypi: failed to create input devices\n");
@@ -1434,7 +1432,7 @@ static int __devexit sonypi_remove(struct platform_device *dev)
-	synchronize_irq(sonypi_device.irq);
+	synchronize_sched();  /* Allow sonypi interrupt to complete. */
flush_scheduled_work();
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 81503d9..fd771a4 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -450,11 +450,6 @@ static int tpm_tis_init(struct device *dev, resource_size_t start,
-	if (request_locality(chip, 0) != 0) {
vendor = ioread32(chip->vendor.iobase + TPM_DID_VID(0));
/* Default timeouts */
@@ -492,6 +487,11 @@ static int tpm_tis_init(struct device *dev, resource_size_t start,
if (intfcaps & TPM_INTF_DATA_AVAIL_INT)
dev_dbg(dev, "\tData Avail Int Support\n");
+	if (request_locality(chip, 0) != 0) {
/* INTERRUPT Setup */
init_waitqueue_head(&chip->vendor.read_queue);
init_waitqueue_head(&chip->vendor.int_queue);
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index c46b7c2..6a7d25f 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
menuconfig DMADEVICES
-	bool "DMA Engine support"
+	bool "DMA Offload Engine support"
depends on (PCI && X86) || ARCH_IOP32X || ARCH_IOP33X || ARCH_IOP13XX
-	  DMA engines can do asynchronous data transfers without
-	  involving the host CPU.  Currently, this framework can be
-	  used to offload memory copies in the network stack and
-	  RAID operations in the MD driver.
+	  Intel(R) offload engines enable offloading memory copies in the
+	  network stack and RAID operations in the MD driver.
diff --git a/drivers/ide/Kconfig b/drivers/ide/Kconfig
index 45b2228..e445fe6 100644
--- a/drivers/ide/Kconfig
+++ b/drivers/ide/Kconfig
@@ -313,6 +313,7 @@ comment "IDE chipset support/bugfixes"
tristate "generic/default IDE chipset support"
@@ -483,7 +484,6 @@ config WDC_ALI15X3
config BLK_DEV_AMD74XX
tristate "AMD and nVidia IDE support"
select BLK_DEV_IDEDMA_PCI
This driver adds explicit support for AMD-7xx and AMD-8111 chips
@@ -883,49 +883,6 @@ config BLK_DEV_IDE_BAST
Say Y here if you want to support the onboard IDE channels on the
Simtec BAST or the Thorcom VR1000
-	bool "ETRAX IDE support"
-	depends on CRIS && BROKEN
-	select BLK_DEV_IDEDMA
-	select IDE_GENERIC
-	  Enables the ETRAX IDE driver.
-	  You can't use parallel ports or SCSI ports at the same time.
-config ETRAX_IDE_DELAY
-	int "Delay for drives to regain consciousness"
-	depends on ETRAX_IDE && ETRAX_ARCH_V10
-	  Number of seconds to wait for IDE drives to spin up after an IDE
-	prompt "IDE reset pin"
-	depends on ETRAX_IDE && ETRAX_ARCH_V10
-	default ETRAX_IDE_PB7_RESET
-config ETRAX_IDE_PB7_RESET
-	bool "Port_PB_Bit_7"
-	  IDE reset on pin 7 on port B
-config ETRAX_IDE_G27_RESET
-	bool "Port_G_Bit_27"
-	  IDE reset on pin 27 on port G
-	bool "H8300 IDE support"
-	select IDE_GENERIC
-	  Enables the H8300 IDE driver.
config BLK_DEV_GAYLE
bool "Amiga Gayle IDE interface support"
@@ -1006,7 +963,7 @@ config BLK_DEV_Q40IDE
config BLK_DEV_MPC8xx_IDE
bool "MPC8xx IDE support"
-	depends on 8xx && (LWMON || IVMS8 || IVML24 || TQM8xxL) && IDE=y && BLK_DEV_IDE=y && !PPC_MERGE
+	depends on 8xx && IDE=y && BLK_DEV_IDE=y && !PPC_MERGE
This option provides support for IDE on Motorola MPC8xx Systems.
diff --git a/drivers/ide/Makefile b/drivers/ide/Makefile
index b181fc6..75dc696 100644
--- a/drivers/ide/Makefile
+++ b/drivers/ide/Makefile
@@ -39,7 +39,7 @@ ide-core-$(CONFIG_BLK_DEV_MPC8xx_IDE) += ppc/mpc8xx.o
ide-core-$(CONFIG_BLK_DEV_IDE_PMAC) += ppc/pmac.o
# built-in only drivers from h8300/
-ide-core-$(CONFIG_IDE_H8300) += h8300/ide-h8300.o
+ide-core-$(CONFIG_H8300) += h8300/ide-h8300.o
obj-$(CONFIG_BLK_DEV_IDE) += ide-core.o
obj-$(CONFIG_IDE_GENERIC) += ide-generic.o
diff --git a/drivers/ide/cris/ide-cris.c b/drivers/ide/cris/ide-cris.c
index 476e0d6..7f5bc2e 100644
--- a/drivers/ide/cris/ide-cris.c
+++ b/drivers/ide/cris/ide-cris.c
@@ -773,16 +773,15 @@ init_e100_ide (void)
/* the IDE control register is at ATA address 6, with CS1 active instead of CS0 */
ide_offsets[IDE_CONTROL_OFFSET] = cris_ide_reg_addr(6, 1, 0);
-	for (h = 0; h < 4; h++) {
-		ide_hwif_t *hwif = NULL;
+	/* first fill in some stuff in the ide_hwifs fields */
+	for(h = 0; h < MAX_HWIFS; h++) {
+		ide_hwif_t *hwif = &ide_hwifs[h];
ide_setup_ports(&hw, cris_ide_base_address(h),
0, 0, cris_ide_ack_intr,
ide_default_irq(0));
ide_register_hw(&hw, NULL, 1, &hwif);
hwif->chipset = ide_etrax100;
hwif->set_pio_mode = &cris_set_pio_mode;
diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c
index 0d795a1..e3add70 100644
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -130,7 +130,6 @@ static const struct drive_list_entry drive_blacklist [] = {
{ "_NEC DV5800A",		NULL		},
{ "SAMSUNG CD-ROM SN-124",	"N001"		},
{ "Seagate STT20000A",		NULL		},
-	{ "CD-ROM CDR_U200",		"1.09"		},
diff --git a/drivers/ide/ide-iops.c b/drivers/ide/ide-iops.c
index 5c32561..e17a9ee 100644
--- a/drivers/ide/ide-iops.c
+++ b/drivers/ide/ide-iops.c
@@ -303,6 +303,9 @@ void default_hwif_transport(ide_hwif_t *hwif)
hwif->atapi_output_bytes = atapi_output_bytes;
+ * Beginning of Taskfile OPCODE Library and feature sets.
void ide_fix_driveid (struct hd_driveid *id)
#ifndef __LITTLE_ENDIAN
@@ -589,9 +592,6 @@ EXPORT_SYMBOL_GPL(ide_in_drive_list);
static const struct drive_list_entry ivb_list[] = {
{ "QUANTUM FIREBALLlct10 05"	, "A03.0900"	},
{ "TSSTcorp CDDVDW SH-S202J"	, "SB00"	},
-	{ "TSSTcorp CDDVDW SH-S202J"	, "SB01"	},
-	{ "TSSTcorp CDDVDW SH-S202N"	, "SB00"	},
-	{ "TSSTcorp CDDVDW SH-S202N"	, "SB01"	},
@@ -756,7 +756,7 @@ int ide_driveid_update(ide_drive_t *drive)
int ide_config_drive_speed(ide_drive_t *drive, u8 speed)
ide_hwif_t *hwif = drive->hwif;
//	while (HWGROUP(drive)->busy)
@@ -767,10 +767,6 @@ int ide_config_drive_speed(ide_drive_t *drive, u8 speed)
hwif->dma_host_off(drive);
-	/* Skip setting PIO flow-control modes on pre-EIDE drives */
-	if ((speed & 0xf8) == XFER_PIO_0 && !(drive->id->capability & 0x08))
* Don't use ide_wait_cmd here - it will
* attempt to set_geometry and recalibrate,
@@ -818,7 +814,6 @@ int ide_config_drive_speed(ide_drive_t *drive, u8 speed)
drive->id->dma_mword &= ~0x0F00;
drive->id->dma_1word &= ~0x0F00;
#ifdef CONFIG_BLK_DEV_IDEDMA
if (speed >= XFER_SW_DMA_0)
hwif->dma_host_on(drive);
diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index ee848c7..56fb0b8 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -644,7 +644,7 @@ static void hwif_register (ide_hwif_t *hwif)
static int wait_hwif_ready(ide_hwif_t *hwif)
printk(KERN_DEBUG "Probing IDE interface %s...\n", hwif->name);
@@ -661,26 +661,20 @@ static int wait_hwif_ready(ide_hwif_t *hwif)
/* Now make sure both master & slave are ready */
-	for (unit = 0; unit < MAX_DRIVES; unit++) {
-		ide_drive_t *drive = &hwif->drives[unit];
+	SELECT_DRIVE(&hwif->drives[0]);
+	hwif->OUTB(8, hwif->io_ports[IDE_CONTROL_OFFSET]);
+	rc = ide_wait_not_busy(hwif, 35000);
+	SELECT_DRIVE(&hwif->drives[1]);
+	hwif->OUTB(8, hwif->io_ports[IDE_CONTROL_OFFSET]);
+	rc = ide_wait_not_busy(hwif, 35000);
-		/* Ignore disks that we will not probe for later. */
-		if (!drive->noprobe || drive->present) {
-			SELECT_DRIVE(drive);
-			hwif->OUTB(8, hwif->io_ports[IDE_CONTROL_OFFSET]);
-			rc = ide_wait_not_busy(hwif, 35000);
-			printk(KERN_DEBUG "%s: ide_wait_not_busy() skipped\n",
/* Exit function with master reselected (let's be sane) */
-	SELECT_DRIVE(&hwif->drives[0]);
+	SELECT_DRIVE(&hwif->drives[0]);
diff --git a/drivers/ide/legacy/ali14xx.c b/drivers/ide/legacy/ali14xx.c
index 38c3a6d..10311ec 100644
--- a/drivers/ide/legacy/ali14xx.c
+++ b/drivers/ide/legacy/ali14xx.c
/* port addresses for auto-detection */
#define ALI_NUM_PORTS 4
-static const int ports[ALI_NUM_PORTS] __initdata =
-	{ 0x074, 0x0f4, 0x034, 0x0e4 };
+static int ports[ALI_NUM_PORTS] __initdata = {0x074, 0x0f4, 0x034, 0x0e4};
/* register initialization data */
typedef struct { u8 reg, data; } RegInitializer;
-static const RegInitializer initData[] __initdata = {
+static RegInitializer initData[] __initdata = {
{0x01, 0x0f}, {0x02, 0x00}, {0x03, 0x00}, {0x04, 0x00},
{0x05, 0x00}, {0x06, 0x00}, {0x07, 0x2b}, {0x0a, 0x0f},
{0x25, 0x00}, {0x26, 0x00}, {0x27, 0x00}, {0x28, 0x00},
@@ -178,7 +177,7 @@ static int __init findPort (void)
* Initialize controller registers with default values.
static int __init initRegisters (void) {
-	const RegInitializer *p;
+	RegInitializer *p;
unsigned long flags;
diff --git a/drivers/ide/legacy/macide.c b/drivers/ide/legacy/macide.c
index 5c6aa77..e87cd2f 100644
--- a/drivers/ide/legacy/macide.c
+++ b/drivers/ide/legacy/macide.c
@@ -81,7 +81,7 @@ int macide_ack_intr(ide_hwif_t* hwif)
* Probe for a Macintosh IDE interface
-void __init macide_init(void)
+void macide_init(void)
diff --git a/drivers/ide/legacy/q40ide.c b/drivers/ide/legacy/q40ide.c
index 6ea46a6..a73db1b 100644
--- a/drivers/ide/legacy/q40ide.c
+++ b/drivers/ide/legacy/q40ide.c
@@ -111,7 +111,7 @@ static const char *q40_ide_names[Q40IDE_NUM_HWIFS]={
* Probe for Q40 IDE interfaces
-void __init q40ide_init(void)
+void q40ide_init(void)
diff --git a/drivers/ide/pci/aec62xx.c b/drivers/ide/pci/aec62xx.c
index 4426850..19ec421 100644
--- a/drivers/ide/pci/aec62xx.c
+++ b/drivers/ide/pci/aec62xx.c
@@ -260,11 +260,6 @@ static int __devinit aec62xx_init_one(struct pci_dev *dev, const struct pci_devi
struct ide_port_info d;
u8 idx = id->driver_data;
-	err = pci_enable_device(dev);
d = aec62xx_chipsets[idx];
@@ -277,11 +272,7 @@ static int __devinit aec62xx_init_one(struct pci_dev *dev, const struct pci_devi
-	err = ide_setup_pci_device(dev, &d);
-		pci_disable_device(dev);
+	return ide_setup_pci_device(dev, &d);
static const struct pci_device_id aec62xx_pci_tbl[] = {
diff --git a/drivers/ide/pci/alim15x3.c b/drivers/ide/pci/alim15x3.c
index ce29393..a607dd3 100644
--- a/drivers/ide/pci/alim15x3.c
+++ b/drivers/ide/pci/alim15x3.c
@@ -603,11 +603,6 @@ static int ali_cable_override(struct pci_dev *pdev)
pdev->subsystem_device == 0x10AF)
-	/* Mitac 8317 (Winbook-A) and relatives */
-	if (pdev->subsystem_vendor == 0x1071 &&
-	    pdev->subsystem_device == 0x8317)
/* Systems by DMI */
if (dmi_check_system(cable_dmi_table))
diff --git a/drivers/ide/pci/piix.c b/drivers/ide/pci/piix.c
index 27781d2..63625a0 100644
--- a/drivers/ide/pci/piix.c
+++ b/drivers/ide/pci/piix.c
@@ -306,7 +306,6 @@ static const struct ich_laptop ich_laptop[] = {
{ 0x27DF, 0x0005, 0x0280 },	/* ICH7 on Acer 5602WLMi */
{ 0x27DF, 0x1025, 0x0110 },	/* ICH7 on Acer 3682WLMi */
{ 0x27DF, 0x1043, 0x1267 },	/* ICH7 on Asus W5F */
-	{ 0x27DF, 0x103C, 0x30A1 },	/* ICH7 on HP Compaq nc2400 */
{ 0x24CA, 0x1025, 0x0061 },	/* ICH4 on Acer Aspire 2023WLMi */
diff --git a/drivers/ide/pci/siimage.c b/drivers/ide/pci/siimage.c
index 5709c25..6d99441 100644
--- a/drivers/ide/pci/siimage.c
+++ b/drivers/ide/pci/siimage.c
- * linux/drivers/ide/pci/siimage.c		Version 1.19	Nov 16 2007
+ * linux/drivers/ide/pci/siimage.c		Version 1.18	Oct 18 2007
* Copyright (C) 2001-2002	Andre Hedrick <andre@linux-ide.org>
* Copyright (C) 2003		Red Hat <alan@redhat.com>
@@ -460,6 +460,48 @@ static void sil_sata_pre_reset(ide_drive_t *drive)
+ *	siimage_reset	-	reset a device on an siimage controller
+ *	@drive: drive to reset
+ *	Perform a controller level reset fo the device. For
+ *	SATA we must also check the PHY.
+static void siimage_reset (ide_drive_t *drive)
+	ide_hwif_t *hwif = HWIF(drive);
+		unsigned long addr = siimage_selreg(hwif, 0);
+		reset = hwif->INB(addr);
+		hwif->OUTB((reset|0x03), addr);
+		/* FIXME:posting */
+		hwif->OUTB(reset, addr);
+		(void) hwif->INB(addr);
+		pci_read_config_byte(hwif->pci_dev, addr, &reset);
+		pci_write_config_byte(hwif->pci_dev, addr, reset|0x03);
+		pci_write_config_byte(hwif->pci_dev, addr, reset);
+		pci_read_config_byte(hwif->pci_dev, addr, &reset);
+	if (SATA_STATUS_REG) {
+		/* SATA_STATUS_REG is valid only when in MMIO mode */
+		u32 sata_stat = readl((void __iomem *)SATA_STATUS_REG);
+		printk(KERN_WARNING "%s: reset phy, status=0x%08x, %s\n",
+			hwif->name, sata_stat, __FUNCTION__);
+		if (!(sata_stat)) {
+			printk(KERN_WARNING "%s: reset phy dead, status=0x%08x\n",
+				hwif->name, sata_stat);
+			drive->failures++;
*	proc_reports_siimage		-	add siimage controller to proc
*	@clocking: SCSC value
@@ -815,6 +857,7 @@ static void __devinit init_hwif_siimage(ide_hwif_t *hwif)
u8 sata = is_sata(hwif);
+	hwif->resetproc = &siimage_reset;
hwif->set_pio_mode = &sil_set_pio_mode;
hwif->set_dma_mode = &sil_set_dma_mode;
diff --git a/drivers/ide/pci/sis5513.c b/drivers/ide/pci/sis5513.c
index d90b429..f6e2ab3 100644
--- a/drivers/ide/pci/sis5513.c
+++ b/drivers/ide/pci/sis5513.c
@@ -526,7 +526,6 @@ static const struct sis_laptop sis_laptop[] = {
/* devid, subvendor, subdev */
{ 0x5513, 0x1043, 0x1107 },	/* ASUS A6K */
{ 0x5513, 0x1734, 0x105f },	/* FSC Amilo A1630 */
-	{ 0x5513, 0x1071, 0x8640 },	/* EasyNote K5305 */
diff --git a/drivers/ide/pci/trm290.c b/drivers/ide/pci/trm290.c
index 0895e75..5011ba2 100644
--- a/drivers/ide/pci/trm290.c
+++ b/drivers/ide/pci/trm290.c
@@ -240,6 +240,9 @@ static int trm290_ide_dma_test_irq (ide_drive_t *drive)
return (status == 0x00ff);
+ * Invoked from ide-dma.c at boot time.
static void __devinit init_hwif_trm290(ide_hwif_t *hwif)
unsigned int cfgbase = 0;
diff --git a/drivers/ide/ppc/pmac.c b/drivers/ide/ppc/pmac.c
index 7f7a598..5afdfef 100644
--- a/drivers/ide/ppc/pmac.c
+++ b/drivers/ide/ppc/pmac.c
@@ -1513,7 +1513,7 @@ pmac_ide_build_dmatable(ide_drive_t *drive, struct request *rq)
if (pmif->broken_dma && cur_addr & (L1_CACHE_BYTES - 1)) {
if (pmif->broken_dma_warn == 0) {
-				printk(KERN_WARNING "%s: DMA on non aligned address, "
+				printk(KERN_WARNING "%s: DMA on non aligned address,"
"switching to PIO on Ohare chipset\n", drive->name);
pmif->broken_dma_warn = 1;
diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c
index f7782c8..453eb99 100644
--- a/drivers/infiniband/hw/ehca/ehca_av.c
+++ b/drivers/infiniband/hw/ehca/ehca_av.c
@@ -76,12 +76,8 @@ int ehca_calc_ipd(struct ehca_shca *shca, int port,
link = ib_width_enum_to_int(pa.active_width) * pa.active_speed;
-	/* no need to throttle if path faster than link */
-	/* IPD = round((link / path) - 1) */
-	*ipd = ((link + (path >> 1)) / path) - 1;
+	/* IPD = round((link / path) - 1) */
+	*ipd = ((link + (path >> 1)) / path) - 1;
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index dd12668..2e3e654 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -1203,7 +1203,7 @@ static int internal_modify_qp(struct ib_qp *ibqp,
mqpcb->service_level = attr->ah_attr.sl;
update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL, 1);
-		if (ehca_calc_ipd(shca, mqpcb->prim_phys_port,
+		if (ehca_calc_ipd(shca, my_qp->init_attr.port_num,
attr->ah_attr.static_rate,
&mqpcb->max_static_rate)) {
@@ -1302,7 +1302,7 @@ static int internal_modify_qp(struct ib_qp *ibqp,
mqpcb->source_path_bits_al = attr->alt_ah_attr.src_path_bits;
mqpcb->service_level_al = attr->alt_ah_attr.sl;
-		if (ehca_calc_ipd(shca, mqpcb->alt_phys_port,
+		if (ehca_calc_ipd(shca, my_qp->init_attr.port_num,
attr->alt_ah_attr.static_rate,
&mqpcb->max_static_rate_al)) {
diff --git a/drivers/infiniband/hw/ipath/ipath_cq.c b/drivers/infiniband/hw/ipath/ipath_cq.c
3123
index d1380c7..08d8ae1 100644
3124
--- a/drivers/infiniband/hw/ipath/ipath_cq.c
3125
+++ b/drivers/infiniband/hw/ipath/ipath_cq.c
3126
@@ -395,9 +395,12 @@ int ipath_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
3130
- /* Check that we can write the offset to mmap. */
3132
+ * Return the address of the WC as the offset to mmap.
3133
+ * See ipath_mmap() for details.
3135
if (udata && udata->outlen >= sizeof(__u64)) {
3137
+ __u64 offset = (__u64) wc;
3139
ret = ib_copy_to_udata(udata, &offset, sizeof(offset));
3141
@@ -447,18 +450,6 @@ int ipath_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
3142
struct ipath_mmap_info *ip = cq->ip;
3144
ipath_update_mmap_info(dev, ip, sz, wc);
3147
- * Return the offset to mmap.
3148
- * See ipath_mmap() for details.
3150
- if (udata && udata->outlen >= sizeof(__u64)) {
3151
- ret = ib_copy_to_udata(udata, &ip->offset,
3152
- sizeof(ip->offset));
3157
spin_lock_irq(&dev->pending_lock);
3158
if (list_empty(&ip->pending_mmaps))
3159
list_add(&ip->pending_mmaps, &dev->pending_mmaps);
3160
diff --git a/drivers/infiniband/hw/ipath/ipath_qp.c b/drivers/infiniband/hw/ipath/ipath_qp.c
index b997ff8..6a41fdb 100644
--- a/drivers/infiniband/hw/ipath/ipath_qp.c
+++ b/drivers/infiniband/hw/ipath/ipath_qp.c
@@ -835,8 +835,7 @@ struct ib_qp *ipath_create_qp(struct ib_pd *ibpd,
			init_attr->qp_type);
-		vfree(qp->r_rq.wq);
@@ -864,7 +863,7 @@ struct ib_qp *ipath_create_qp(struct ib_pd *ibpd,
			u32 s = sizeof(struct ipath_rwq) +
@@ -876,7 +875,7 @@ struct ib_qp *ipath_create_qp(struct ib_pd *ibpd,
				ret = ERR_PTR(-ENOMEM);
			err = ib_copy_to_udata(udata, &(qp->ip->offset),
@@ -908,11 +907,9 @@ struct ib_qp *ipath_create_qp(struct ib_pd *ibpd,
-		kref_put(&qp->ip->ref, ipath_release_mmap_info);
-	vfree(qp->r_rq.wq);
-	ipath_free_qp(&dev->qp_table, qp);
+	vfree(qp->r_rq.wq);
diff --git a/drivers/infiniband/hw/ipath/ipath_srq.c b/drivers/infiniband/hw/ipath/ipath_srq.c
index 2fef36f..40c36ec 100644
--- a/drivers/infiniband/hw/ipath/ipath_srq.c
+++ b/drivers/infiniband/hw/ipath/ipath_srq.c
@@ -59,7 +59,7 @@ int ipath_post_srq_receive(struct ib_srq *ibsrq, struct ib_recv_wr *wr,
		if ((unsigned) wr->num_sge > srq->rq.max_sge) {
@@ -211,11 +211,11 @@ int ipath_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
		     struct ib_udata *udata)
	struct ipath_srq *srq = to_isrq(ibsrq);
-	struct ipath_rwq *wq;
	if (attr_mask & IB_SRQ_MAX_WR) {
		struct ipath_rwq *owq;
+		struct ipath_rwq *wq;
		struct ipath_rwqe *p;
		u32 sz, size, n, head, tail;
@@ -236,20 +236,27 @@ int ipath_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
-		/* Check that we can write the offset to mmap. */
+		 * Return the address of the RWQ as the offset to mmap.
+		 * See ipath_mmap() for details.
		if (udata && udata->inlen >= sizeof(__u64)) {
+			__u64 offset = (__u64) wq;
			ret = ib_copy_from_udata(&offset_addr, udata,
						 sizeof(offset_addr));
			udata->outbuf = (void __user *) offset_addr;
			ret = ib_copy_to_udata(udata, &offset,
		spin_lock_irq(&srq->rq.lock);
@@ -270,8 +277,10 @@ int ipath_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
+			spin_unlock_irq(&srq->rq.lock);
@@ -305,18 +314,6 @@ int ipath_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
			u32 s = sizeof(struct ipath_rwq) + size * sz;
			ipath_update_mmap_info(dev, ip, s, wq);
-			 * Return the offset to mmap.
-			 * See ipath_mmap() for details.
-			if (udata && udata->inlen >= sizeof(__u64)) {
-				ret = ib_copy_to_udata(udata, &ip->offset,
-						       sizeof(ip->offset));
			spin_lock_irq(&dev->pending_lock);
			if (list_empty(&ip->pending_mmaps))
				list_add(&ip->pending_mmaps,
@@ -331,12 +328,7 @@ int ipath_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
		srq->limit = attr->srq_limit;
		spin_unlock_irq(&srq->rq.lock);
-	spin_unlock_irq(&srq->rq.lock);
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index c4c9984..74f77e7 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -302,10 +302,8 @@ static int ipath_post_one_send(struct ipath_qp *qp, struct ib_send_wr *wr)
	next = qp->s_head + 1;
	if (next >= qp->s_size)
-	if (next == qp->s_last) {
+	if (next == qp->s_last)
	wqe = get_swqe_ptr(qp, qp->s_head);
@@ -406,7 +404,7 @@ static int ipath_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
		if ((unsigned) wr->num_sge > qp->r_rq.max_sge) {
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index c9f6077..a03a65e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -460,9 +460,6 @@ static struct ipoib_path *path_rec_create(struct net_device *dev, void *gid)
	struct ipoib_dev_priv *priv = netdev_priv(dev);
	struct ipoib_path *path;
-	if (!priv->broadcast)
	path = kzalloc(sizeof *path, GFP_ATOMIC);
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 4a17743..d687980 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -310,15 +310,13 @@ static unsigned int iser_data_buf_aligned_len(struct iser_data_buf *data,
		if (i + 1 < data->dma_nents) {
			next_addr = ib_sg_dma_address(ibdev, sg_next(sg));
			/* are i, i+1 fragments of the same page? */
-			if (end_addr == next_addr) {
+			if (end_addr == next_addr)
-			} else if (!IS_4K_ALIGNED(end_addr)) {
+			else if (!IS_4K_ALIGNED(end_addr)) {
	if (i == data->dma_nents)
		ret_len = cnt;	/* loop ended */
diff --git a/drivers/input/keyboard/Kconfig b/drivers/input/keyboard/Kconfig
index dfa6592..2316a01 100644
--- a/drivers/input/keyboard/Kconfig
+++ b/drivers/input/keyboard/Kconfig
@@ -286,7 +286,7 @@ config KEYBOARD_MAPLE
config KEYBOARD_BFIN
	tristate "Blackfin BF54x keypad support"
-	depends on (BF54x && !BF544)
	  Say Y here if you want to use the BF54x keypad.
diff --git a/drivers/input/keyboard/gpio_keys.c b/drivers/input/keyboard/gpio_keys.c
index 6a9ca4b..3eddf52 100644
--- a/drivers/input/keyboard/gpio_keys.c
+++ b/drivers/input/keyboard/gpio_keys.c
@@ -75,32 +75,16 @@ static int __devinit gpio_keys_probe(struct platform_device *pdev)
	for (i = 0; i < pdata->nbuttons; i++) {
		struct gpio_keys_button *button = &pdata->buttons[i];
+		int irq = gpio_to_irq(button->gpio);
		unsigned int type = button->type ?: EV_KEY;
-		error = gpio_request(button->gpio, button->desc ?: "gpio_keys");
-			pr_err("gpio-keys: failed to request GPIO %d,"
-				" error %d\n", button->gpio, error);
-		error = gpio_direction_input(button->gpio);
-			pr_err("gpio-keys: failed to configure input"
-				" direction for GPIO %d, error %d\n",
-				button->gpio, error);
-			gpio_free(button->gpio);
-		irq = gpio_to_irq(button->gpio);
-			pr_err("gpio-keys: Unable to get irq number"
-				" for GPIO %d, error %d\n",
+				"Unable to get irq number for GPIO %d,"
				button->gpio, error);
-			gpio_free(button->gpio);
@@ -110,9 +94,9 @@ static int __devinit gpio_keys_probe(struct platform_device *pdev)
				    button->desc ? button->desc : "gpio_keys",
-			pr_err("gpio-keys: Unable to claim irq %d; error %d\n",
+				"gpio-keys: Unable to claim irq %d; error %d\n",
-			gpio_free(button->gpio);
@@ -124,7 +108,8 @@ static int __devinit gpio_keys_probe(struct platform_device *pdev)
	error = input_register_device(input);
-		pr_err("gpio-keys: Unable to register input device, "
+			"gpio-keys: Unable to register input device, "
			"error: %d\n", error);
@@ -134,10 +119,8 @@ static int __devinit gpio_keys_probe(struct platform_device *pdev)
-	while (--i >= 0) {
		free_irq(gpio_to_irq(pdata->buttons[i].gpio), pdev);
-		gpio_free(pdata->buttons[i].gpio);
	platform_set_drvdata(pdev, NULL);
	input_free_device(input);
@@ -156,7 +139,6 @@ static int __devexit gpio_keys_remove(struct platform_device *pdev)
	for (i = 0; i < pdata->nbuttons; i++) {
		int irq = gpio_to_irq(pdata->buttons[i].gpio);
		free_irq(irq, pdev);
-		gpio_free(pdata->buttons[i].gpio);
	input_unregister_device(input);
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h
index c5e68dc..f8fe421 100644
--- a/drivers/input/serio/i8042-x86ia64io.h
+++ b/drivers/input/serio/i8042-x86ia64io.h
@@ -110,14 +110,6 @@ static struct dmi_system_id __initdata i8042_dmi_noloop_table[] = {
			DMI_MATCH(DMI_PRODUCT_VERSION, "5a"),
-		.ident = "Microsoft Virtual Machine",
-			DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
-			DMI_MATCH(DMI_PRODUCT_NAME, "Virtual Machine"),
-			DMI_MATCH(DMI_PRODUCT_VERSION, "VS2005R2"),
diff --git a/drivers/isdn/hisax/hfcscard.c b/drivers/isdn/hisax/hfcscard.c
index 909d670..57670dc 100644
--- a/drivers/isdn/hisax/hfcscard.c
+++ b/drivers/isdn/hisax/hfcscard.c
@@ -118,7 +118,8 @@ hfcs_card_msg(struct IsdnCardState *cs, int mt, void *arg)
		delay = (75*HZ)/100 +1;
-		mod_timer(&cs->hw.hfcD.timer, jiffies + delay);
+		cs->hw.hfcD.timer.expires = jiffies + delay;
+		add_timer(&cs->hw.hfcD.timer);
		spin_lock_irqsave(&cs->lock, flags);
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 6569206..4086080 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
menuconfig VIRTUALIZATION
	bool "Virtualization"
+	depends on ARCH_SUPPORTS_KVM || X86
	  Say Y here to get to see options for using your Linux host to run other
@@ -16,7 +16,7 @@ if VIRTUALIZATION
	tristate "Kernel-based Virtual Machine (KVM) support"
-	depends on X86 && EXPERIMENTAL
+	depends on ARCH_SUPPORTS_KVM && EXPERIMENTAL
	select PREEMPT_NOTIFIERS
diff --git a/drivers/kvm/Makefile b/drivers/kvm/Makefile
index e5a8f4d..cf18ad4 100644
--- a/drivers/kvm/Makefile
+++ b/drivers/kvm/Makefile
# Makefile for Kernel-based Virtual Machine module
-kvm-objs := kvm_main.o mmu.o x86_emulate.o i8259.o irq.o lapic.o ioapic.o
+kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o ioapic.o
obj-$(CONFIG_KVM) += kvm.o
kvm-intel-objs = vmx.o
obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/drivers/kvm/i8259.c b/drivers/kvm/i8259.c
index a679157..f0dc2ee 100644
--- a/drivers/kvm/i8259.c
+++ b/drivers/kvm/i8259.c
@@ -181,10 +181,8 @@ int kvm_pic_read_irq(struct kvm_pic *s)
-static void pic_reset(void *opaque)
+void kvm_pic_reset(struct kvm_kpic_state *s)
-	struct kvm_kpic_state *s = opaque;
@@ -209,7 +207,7 @@ static void pic_ioport_write(void *opaque, u32 addr, u32 val)
-		pic_reset(s);	/* init */
+		kvm_pic_reset(s);	/* init */
			 * deassert a pending interrupt
diff --git a/drivers/kvm/ioapic.c b/drivers/kvm/ioapic.c
index c7992e6..e7debfa 100644
--- a/drivers/kvm/ioapic.c
+++ b/drivers/kvm/ioapic.c
#include <linux/kvm.h>
#include <linux/mm.h>
#include <linux/highmem.h>
#include <linux/hrtimer.h>
#include <linux/io.h>
#include <asm/processor.h>
-#include <asm/msr.h>
#include <asm/page.h>
#include <asm/current.h>
-#include <asm/apicdef.h>
-#include <asm/io_apic.h>
-/* #define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg) */
+#define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg)
#define ioapic_debug(fmt, arg...)
static void ioapic_deliver(struct kvm_ioapic *vioapic, int irq);
static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
@@ -113,7 +115,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
		index = (ioapic->ioregsel - 0x10) >> 1;
-		ioapic_debug("change redir index %x val %x", index, val);
+		ioapic_debug("change redir index %x val %x\n", index, val);
		if (index >= IOAPIC_NUM_PINS)
		if (ioapic->ioregsel & 1) {
@@ -131,16 +133,16 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
static void ioapic_inj_irq(struct kvm_ioapic *ioapic,
-			   struct kvm_lapic *target,
+			   struct kvm_vcpu *vcpu,
			   u8 vector, u8 trig_mode, u8 delivery_mode)
-	ioapic_debug("irq %d trig %d deliv %d", vector, trig_mode,
+	ioapic_debug("irq %d trig %d deliv %d\n", vector, trig_mode,
-	ASSERT((delivery_mode == dest_Fixed) ||
-	       (delivery_mode == dest_LowestPrio));
+	ASSERT((delivery_mode == IOAPIC_FIXED) ||
+	       (delivery_mode == IOAPIC_LOWEST_PRIORITY));
-	kvm_apic_set_irq(target, vector, trig_mode);
+	kvm_apic_set_irq(vcpu, vector, trig_mode);
static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
@@ -151,7 +153,7 @@ static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
	struct kvm *kvm = ioapic->kvm;
	struct kvm_vcpu *vcpu;
-	ioapic_debug("dest %d dest_mode %d", dest, dest_mode);
+	ioapic_debug("dest %d dest_mode %d\n", dest, dest_mode);
	if (dest_mode == 0) {	/* Physical mode. */
		if (dest == 0xFF) {	/* Broadcast. */
@@ -179,7 +181,7 @@ static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
			    kvm_apic_match_logical_addr(vcpu->apic, dest))
				mask |= 1 << vcpu->vcpu_id;
-	ioapic_debug("mask %x", mask);
+	ioapic_debug("mask %x\n", mask);
@@ -191,41 +193,39 @@ static void ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
	u8 vector = ioapic->redirtbl[irq].fields.vector;
	u8 trig_mode = ioapic->redirtbl[irq].fields.trig_mode;
	u32 deliver_bitmask;
-	struct kvm_lapic *target;
	struct kvm_vcpu *vcpu;
	ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
-		     "vector=%x trig_mode=%x",
+		     "vector=%x trig_mode=%x\n",
		     dest, dest_mode, delivery_mode, vector, trig_mode);
	deliver_bitmask = ioapic_get_delivery_bitmask(ioapic, dest, dest_mode);
	if (!deliver_bitmask) {
-		ioapic_debug("no target on destination");
+		ioapic_debug("no target on destination\n");
	switch (delivery_mode) {
-	case dest_LowestPrio:
-		kvm_apic_round_robin(ioapic->kvm, vector, deliver_bitmask);
-		if (target != NULL)
-			ioapic_inj_irq(ioapic, target, vector,
+	case IOAPIC_LOWEST_PRIORITY:
+		vcpu = kvm_get_lowest_prio_vcpu(ioapic->kvm, vector,
+			ioapic_inj_irq(ioapic, vcpu, vector,
				       trig_mode, delivery_mode);
-			ioapic_debug("null round robin: "
-				     "mask=%x vector=%x delivery_mode=%x",
-				     deliver_bitmask, vector, dest_LowestPrio);
+			ioapic_debug("null lowest prio vcpu: "
+				     "mask=%x vector=%x delivery_mode=%x\n",
+				     deliver_bitmask, vector, IOAPIC_LOWEST_PRIORITY);
+	case IOAPIC_FIXED:
		for (vcpu_id = 0; deliver_bitmask != 0; vcpu_id++) {
			if (!(deliver_bitmask & (1 << vcpu_id)))
			deliver_bitmask &= ~(1 << vcpu_id);
			vcpu = ioapic->kvm->vcpus[vcpu_id];
-				target = vcpu->apic;
-				ioapic_inj_irq(ioapic, target, vector,
+				ioapic_inj_irq(ioapic, vcpu, vector,
					       trig_mode, delivery_mode);
@@ -304,7 +304,7 @@ static void ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
	struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
-	ioapic_debug("addr %lx", (unsigned long)addr);
+	ioapic_debug("addr %lx\n", (unsigned long)addr);
	ASSERT(!(addr & 0xf));	/* check alignment */
@@ -341,8 +341,8 @@ static void ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
	struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
-	ioapic_debug("ioapic_mmio_write addr=%lx len=%d val=%p\n",
+	ioapic_debug("ioapic_mmio_write addr=%p len=%d val=%p\n",
+		     (void*)addr, len, val);
	ASSERT(!(addr & 0xf));	/* check alignment */
	if (len == 4 || len == 8)
		data = *(u32 *) val;
@@ -360,24 +360,38 @@ static void ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
	case IOAPIC_REG_WINDOW:
		ioapic_write_indirect(ioapic, data);
+	case IOAPIC_REG_EOI:
+		kvm_ioapic_update_eoi(ioapic, data);
+void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
+	for (i = 0; i < IOAPIC_NUM_PINS; i++)
+		ioapic->redirtbl[i].fields.mask = 1;
+	ioapic->base_address = IOAPIC_DEFAULT_BASE_ADDRESS;
+	ioapic->ioregsel = 0;
int kvm_ioapic_init(struct kvm *kvm)
	struct kvm_ioapic *ioapic;
	ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL);
	kvm->vioapic = ioapic;
-	for (i = 0; i < IOAPIC_NUM_PINS; i++)
-		ioapic->redirtbl[i].fields.mask = 1;
-	ioapic->base_address = IOAPIC_DEFAULT_BASE_ADDRESS;
+	kvm_ioapic_reset(ioapic);
	ioapic->dev.read = ioapic_mmio_read;
	ioapic->dev.write = ioapic_mmio_write;
	ioapic->dev.in_range = ioapic_in_range;
diff --git a/drivers/kvm/irq.c b/drivers/kvm/irq.c
index 7628c7f..59b47c5 100644
--- a/drivers/kvm/irq.c
+++ b/drivers/kvm/irq.c
#include <linux/module.h>
diff --git a/drivers/kvm/irq.h b/drivers/kvm/irq.h
index 11fc014..75f5f18 100644
--- a/drivers/kvm/irq.h
+++ b/drivers/kvm/irq.h
@@ -79,6 +79,14 @@ void kvm_pic_update_irq(struct kvm_pic *s);
#define IOAPIC_REG_VERSION 0x01
#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */
+/*ioapic delivery mode*/
+#define IOAPIC_FIXED 0x0
+#define IOAPIC_LOWEST_PRIORITY 0x1
+#define IOAPIC_PMI 0x2
+#define IOAPIC_NMI 0x4
+#define IOAPIC_INIT 0x5
+#define IOAPIC_EXTINT 0x7
@@ -139,18 +147,21 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
int kvm_create_lapic(struct kvm_vcpu *vcpu);
void kvm_lapic_reset(struct kvm_vcpu *vcpu);
-void kvm_free_apic(struct kvm_lapic *apic);
+void kvm_pic_reset(struct kvm_kpic_state *s);
+void kvm_ioapic_reset(struct kvm_ioapic *ioapic);
+void kvm_free_lapic(struct kvm_vcpu *vcpu);
u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu);
void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8);
void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
-struct kvm_lapic *kvm_apic_round_robin(struct kvm *kvm, u8 vector,
+struct kvm_vcpu *kvm_get_lowest_prio_vcpu(struct kvm *kvm, u8 vector,
				       unsigned long bitmap);
u64 kvm_get_apic_base(struct kvm_vcpu *vcpu);
void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data);
int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
void kvm_ioapic_update_eoi(struct kvm *kvm, int vector);
int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
-int kvm_apic_set_irq(struct kvm_lapic *apic, u8 vec, u8 trig);
+int kvm_apic_set_irq(struct kvm_vcpu *vcpu, u8 vec, u8 trig);
void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu);
int kvm_ioapic_init(struct kvm *kvm);
void kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level);
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 3b0bc4b..be18620 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
#include <linux/types.h>
+#include <linux/hardirq.h>
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/spinlock.h>
#include <linux/kvm.h>
#include <linux/kvm_para.h>
-#define CR3_PAE_RESERVED_BITS ((X86_CR3_PWT | X86_CR3_PCD) - 1)
-#define CR3_NONPAE_RESERVED_BITS ((PAGE_SIZE-1) & ~(X86_CR3_PWT | X86_CR3_PCD))
-#define CR3_L_MODE_RESERVED_BITS (CR3_NONPAE_RESERVED_BITS|0xFFFFFF0000000000ULL)
-#define KVM_GUEST_CR0_MASK \
-	(X86_CR0_PG | X86_CR0_PE | X86_CR0_WP | X86_CR0_NE \
-	 | X86_CR0_NW | X86_CR0_CD)
-#define KVM_VM_CR0_ALWAYS_ON \
-	(X86_CR0_PG | X86_CR0_PE | X86_CR0_WP | X86_CR0_NE | X86_CR0_TS \
-#define KVM_GUEST_CR4_MASK \
-	(X86_CR4_VME | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_PGE | X86_CR4_VMXE)
-#define KVM_PMODE_VM_CR4_ALWAYS_ON (X86_CR4_PAE | X86_CR4_VMXE)
-#define KVM_RMODE_VM_CR4_ALWAYS_ON (X86_CR4_VME | X86_CR4_PAE | X86_CR4_VMXE)
-#define INVALID_PAGE (~(hpa_t)0)
-#define UNMAPPED_GVA (~(gpa_t)0)
#define KVM_MAX_VCPUS 4
#define KVM_ALIAS_SLOTS 4
#define KVM_MEMORY_SLOTS 8
+/* memory slots that does not exposed to userspace */
+#define KVM_PRIVATE_MEM_SLOTS 4
+#define KVM_PERMILLE_MMU_PAGES 20
+#define KVM_MIN_ALLOC_MMU_PAGES 64
#define KVM_NUM_MMU_PAGES 1024
#define KVM_MIN_FREE_MMU_PAGES 5
#define KVM_REFILL_PAGES 25
#define KVM_MAX_CPUID_ENTRIES 40
-#define DE_VECTOR 0
-#define NM_VECTOR 7
-#define DF_VECTOR 8
-#define TS_VECTOR 10
-#define NP_VECTOR 11
-#define SS_VECTOR 12
-#define GP_VECTOR 13
-#define PF_VECTOR 14
-#define SELECTOR_TI_MASK (1 << 2)
-#define SELECTOR_RPL_MASK 0x03
-#define IOPL_SHIFT 12
#define KVM_PIO_PAGE_OFFSET 1
 * vcpu->requests bit members
-#define KVM_TLB_FLUSH 0
+#define KVM_REQ_TLB_FLUSH 0
@@ -125,6 +98,8 @@ struct kvm_mmu_page {
	union kvm_mmu_page_role role;
+	/* hold the gfn of each spte inside spt */
	unsigned long slot_bitmap;	/* One bit set per slot which has memory
					 * in this shadow page.
@@ -149,6 +124,8 @@ struct kvm_mmu {
	int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
	void (*free)(struct kvm_vcpu *vcpu);
	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva);
+	void (*prefetch_page)(struct kvm_vcpu *vcpu,
+			      struct kvm_mmu_page *page);
	int shadow_root_level;
@@ -156,56 +133,23 @@ struct kvm_mmu {
-#define KVM_NR_MEM_OBJS 20
+#define KVM_NR_MEM_OBJS 40
+ * We don't want allocation failures within the mmu code, so we preallocate
+ * enough memory for a single page fault in a cache.
struct kvm_mmu_memory_cache {
	void *objects[KVM_NR_MEM_OBJS];
- * We don't want allocation failures within the mmu code, so we preallocate
- * enough memory for a single page fault in a cache.
struct kvm_guest_debug {
	unsigned long bp[4];
-	VCPU_REGS_RAX = 0,
-	VCPU_REGS_RCX = 1,
-	VCPU_REGS_RDX = 2,
-	VCPU_REGS_RBX = 3,
-	VCPU_REGS_RSP = 4,
-	VCPU_REGS_RBP = 5,
-	VCPU_REGS_RSI = 6,
-	VCPU_REGS_RDI = 7,
-#ifdef CONFIG_X86_64
-	VCPU_REGS_R10 = 10,
-	VCPU_REGS_R11 = 11,
-	VCPU_REGS_R12 = 12,
-	VCPU_REGS_R13 = 13,
-	VCPU_REGS_R14 = 14,
-	VCPU_REGS_R15 = 15,
struct kvm_pio_request {
	unsigned long count;
@@ -219,7 +163,7 @@ struct kvm_pio_request {
+struct kvm_vcpu_stat {
@@ -234,8 +178,11 @@ struct kvm_stat {
	u32 request_irq_exits;
+	u32 host_state_reload;
+	u32 insn_emulation;
+	u32 insn_emulation_fail;
struct kvm_io_device {
@@ -298,91 +245,37 @@ struct kvm_io_device *kvm_io_bus_find_dev(struct kvm_io_bus *bus, gpa_t addr);
void kvm_io_bus_register_dev(struct kvm_io_bus *bus,
			     struct kvm_io_device *dev);
-	struct preempt_notifier preempt_notifier;
-	struct mutex mutex;
-	struct kvm_run *run;
-	int interrupt_window_open;
-	unsigned long requests;
-	unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
-	DECLARE_BITMAP(irq_pending, KVM_NR_INTERRUPTS);
-	unsigned long regs[NR_VCPU_REGS]; /* for rsp: vcpu_load_rsp_rip() */
-	unsigned long rip;	/* needs vcpu_load_rsp_rip() */
-	unsigned long cr0;
-	unsigned long cr2;
-	unsigned long cr3;
-	gpa_t para_state_gpa;
-	struct page *para_state_page;
-	gpa_t hypercall_gpa;
-	unsigned long cr4;
-	unsigned long cr8;
-	u64 pdptrs[4]; /* pae */
-	struct kvm_lapic *apic;    /* kernel irqchip context */
-#define VCPU_MP_STATE_RUNNABLE 0
-#define VCPU_MP_STATE_UNINITIALIZED 1
-#define VCPU_MP_STATE_INIT_RECEIVED 2
-#define VCPU_MP_STATE_SIPI_RECEIVED 3
-#define VCPU_MP_STATE_HALTED 4
-	u64 ia32_misc_enable_msr;
-	struct kvm_mmu mmu;
-	struct kvm_mmu_memory_cache mmu_pte_chain_cache;
-	struct kvm_mmu_memory_cache mmu_rmap_desc_cache;
-	struct kvm_mmu_memory_cache mmu_page_cache;
-	struct kvm_mmu_memory_cache mmu_page_header_cache;
-	gfn_t last_pt_write_gfn;
-	int last_pt_write_count;
-	struct kvm_guest_debug guest_debug;
-	struct i387_fxsave_struct host_fx_image;
-	struct i387_fxsave_struct guest_fx_image;
-	int guest_fpu_loaded;
-	int mmio_read_completed;
-	int mmio_is_write;
-	unsigned char mmio_data[8];
+#ifdef CONFIG_HAS_IOMEM
+#define KVM_VCPU_MMIO \
+	int mmio_needed; \
+	int mmio_read_completed; \
+	int mmio_is_write; \
+	unsigned char mmio_data[8]; \
	gpa_t mmio_phys_addr;
-	gva_t mmio_fault_cr2;
-	struct kvm_pio_request pio;
-	wait_queue_head_t wq;
-	int sigset_active;
+#define KVM_VCPU_MMIO
-	struct kvm_stat stat;
-	struct kvm_save_segment {
-		unsigned long base;
-	} tr, es, ds, fs, gs;
-	int halt_request; /* real mode on Intel only */
-	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
+#define KVM_VCPU_COMM \
+	struct kvm *kvm; \
+	struct preempt_notifier preempt_notifier; \
+	struct mutex mutex; \
+	struct kvm_run *run; \
+	unsigned long requests; \
+	struct kvm_guest_debug guest_debug; \
+	int guest_fpu_loaded; \
+	wait_queue_head_t wq; \
+	int sigset_active; \
+	sigset_t sigset; \
+	struct kvm_vcpu_stat stat; \
struct kvm_mem_alias {
@@ -394,24 +287,39 @@ struct kvm_memory_slot {
	unsigned long npages;
	unsigned long flags;
-	struct page **phys_mem;
+	unsigned long *rmap;
	unsigned long *dirty_bitmap;
+	unsigned long userspace_addr;
+struct kvm_vm_stat {
+	u32 mmu_shadow_zapped;
+	u32 mmu_pte_write;
+	u32 mmu_pte_updated;
+	u32 mmu_pde_zapped;
+	u32 remote_tlb_flush;
	struct mutex lock; /* protects everything except vcpus */
+	struct mm_struct *mm; /* userspace tied to this vm */
	struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
-	struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS];
+	struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS +
+					KVM_PRIVATE_MEM_SLOTS];
	 * Hash table of struct kvm_mmu_page.
	struct list_head active_mmu_pages;
-	int n_free_mmu_pages;
+	unsigned int n_free_mmu_pages;
+	unsigned int n_requested_mmu_pages;
+	unsigned int n_alloc_mmu_pages;
	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
-	unsigned long rmap_overflow;
	struct list_head vm_list;
	struct kvm_io_bus mmio_bus;
@@ -419,6 +327,9 @@ struct kvm {
	struct kvm_pic *vpic;
	struct kvm_ioapic *vioapic;
	int round_robin_prev_vcpu;
+	unsigned int tss_addr;
+	struct page *apic_access_page;
+	struct kvm_vm_stat stat;
static inline struct kvm_pic *pic_irqchip(struct kvm *kvm)
@@ -433,7 +344,7 @@ static inline struct kvm_ioapic *ioapic_irqchip(struct kvm *kvm)
static inline int irqchip_in_kernel(struct kvm *kvm)
-	return pic_irqchip(kvm) != 0;
+	return pic_irqchip(kvm) != NULL;
struct descriptor_table {
@@ -441,80 +352,13 @@ struct descriptor_table {
} __attribute__((packed));
-struct kvm_x86_ops {
-	int (*cpu_has_kvm_support)(void);          /* __init */
-	int (*disabled_by_bios)(void);             /* __init */
-	void (*hardware_enable)(void *dummy);      /* __init */
-	void (*hardware_disable)(void *dummy);
-	void (*check_processor_compatibility)(void *rtn);
-	int (*hardware_setup)(void);               /* __init */
-	void (*hardware_unsetup)(void);            /* __exit */
-	/* Create, but do not attach this VCPU */
-	struct kvm_vcpu *(*vcpu_create)(struct kvm *kvm, unsigned id);
-	void (*vcpu_free)(struct kvm_vcpu *vcpu);
-	void (*vcpu_reset)(struct kvm_vcpu *vcpu);
-	void (*prepare_guest_switch)(struct kvm_vcpu *vcpu);
-	void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
-	void (*vcpu_put)(struct kvm_vcpu *vcpu);
-	void (*vcpu_decache)(struct kvm_vcpu *vcpu);
-	int (*set_guest_debug)(struct kvm_vcpu *vcpu,
-			       struct kvm_debug_guest *dbg);
-	void (*guest_debug_pre)(struct kvm_vcpu *vcpu);
-	int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
-	int (*set_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
-	u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg);
-	void (*get_segment)(struct kvm_vcpu *vcpu,
-			    struct kvm_segment *var, int seg);
-	void (*set_segment)(struct kvm_vcpu *vcpu,
-			    struct kvm_segment *var, int seg);
-	void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
-	void (*decache_cr4_guest_bits)(struct kvm_vcpu *vcpu);
-	void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
-	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
-	void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
-	void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
-	void (*get_idt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
-	void (*set_idt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
-	void (*get_gdt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
-	void (*set_gdt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
-	unsigned long (*get_dr)(struct kvm_vcpu *vcpu, int dr);
-	void (*set_dr)(struct kvm_vcpu *vcpu, int dr, unsigned long value,
-	void (*cache_regs)(struct kvm_vcpu *vcpu);
-	void (*decache_regs)(struct kvm_vcpu *vcpu);
-	unsigned long (*get_rflags)(struct kvm_vcpu *vcpu);
-	void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
-	void (*tlb_flush)(struct kvm_vcpu *vcpu);
-	void (*inject_page_fault)(struct kvm_vcpu *vcpu,
-				  unsigned long addr, u32 err_code);
-	void (*inject_gp)(struct kvm_vcpu *vcpu, unsigned err_code);
-	void (*run)(struct kvm_vcpu *vcpu, struct kvm_run *run);
-	int (*handle_exit)(struct kvm_run *run, struct kvm_vcpu *vcpu);
-	void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
-	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
-				unsigned char *hypercall_addr);
-	int (*get_irq)(struct kvm_vcpu *vcpu);
-	void (*set_irq)(struct kvm_vcpu *vcpu, int vec);
-	void (*inject_pending_irq)(struct kvm_vcpu *vcpu);
-	void (*inject_pending_vectors)(struct kvm_vcpu *vcpu,
-				       struct kvm_run *run);
-extern struct kvm_x86_ops *kvm_x86_ops;
/* The guest did something we don't support. */
#define pr_unimpl(vcpu, fmt, ...)					\
	if (printk_ratelimit())						\
		printk(KERN_ERR "kvm: %i: cpu%i " fmt,			\
		       current->tgid, (vcpu)->vcpu_id , ## __VA_ARGS__); \
#define kvm_printf(kvm, fmt ...) printk(KERN_DEBUG fmt)
#define vcpu_printf(vcpu, fmt...) kvm_printf(vcpu->kvm, fmt)
@@ -522,275 +366,154 @@ extern struct kvm_x86_ops *kvm_x86_ops;
int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id);
void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
-int kvm_init_x86(struct kvm_x86_ops *ops, unsigned int vcpu_size,
-		 struct module *module);
-void kvm_exit_x86(void);
+void vcpu_load(struct kvm_vcpu *vcpu);
+void vcpu_put(struct kvm_vcpu *vcpu);
-int kvm_mmu_module_init(void);
-void kvm_mmu_module_exit(void);
+void decache_vcpus_on_cpu(int cpu);
-void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
-int kvm_mmu_create(struct kvm_vcpu *vcpu);
-int kvm_mmu_setup(struct kvm_vcpu *vcpu);
-int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
-void kvm_mmu_zap_all(struct kvm *kvm);
+int kvm_init(void *opaque, unsigned int vcpu_size,
+	     struct module *module);
+void kvm_exit(void);
-hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
#define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
#define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB)
static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
-hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva);
struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
-extern hpa_t bad_page_address;
+extern struct page *bad_page;
+int is_error_page(struct page *page);
+int kvm_is_error_hva(unsigned long addr);
+int kvm_set_memory_region(struct kvm *kvm,
+			  struct kvm_userspace_memory_region *mem,
+int __kvm_set_memory_region(struct kvm *kvm,
+			    struct kvm_userspace_memory_region *mem,
+int kvm_arch_set_memory_region(struct kvm *kvm,
+			       struct kvm_userspace_memory_region *mem,
+			       struct kvm_memory_slot old,
+gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+void kvm_release_page_clean(struct page *page);
+void kvm_release_page_dirty(struct page *page);
+int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
+int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len);
+int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
+			 int offset, int len);
+int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
+		    unsigned long len);
+int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
+int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len);
struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
+int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
-enum emulation_result {
-	EMULATE_DONE,       /* no further processing */
-	EMULATE_DO_MMIO,      /* kvm_run filled with mmio request */
-	EMULATE_FAIL,         /* can't emulate this instruction */
-int emulate_instruction(struct kvm_vcpu *vcpu, struct kvm_run *run,
-			unsigned long cr2, u16 error_code);
-void kvm_report_emulation_failure(struct kvm_vcpu *cvpu, const char *context);
-void realmode_lgdt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
-void realmode_lidt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
-void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
-		   unsigned long *rflags);
-unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr);
-void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
-		     unsigned long *rflags);
-int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data);
-int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
4338
-struct x86_emulate_ctxt;
4340
-int kvm_emulate_pio (struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
4341
- int size, unsigned port);
4342
-int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
4343
- int size, unsigned long count, int down,
4344
- gva_t address, int rep, unsigned port);
4345
-void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
4346
-int kvm_emulate_halt(struct kvm_vcpu *vcpu);
4347
-int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
4348
-int emulate_clts(struct kvm_vcpu *vcpu);
4349
-int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr,
4350
- unsigned long *dest);
4351
-int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr,
4352
- unsigned long value);
4354
-void set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
4355
-void set_cr3(struct kvm_vcpu *vcpu, unsigned long cr0);
4356
-void set_cr4(struct kvm_vcpu *vcpu, unsigned long cr0);
4357
-void set_cr8(struct kvm_vcpu *vcpu, unsigned long cr0);
4358
-unsigned long get_cr8(struct kvm_vcpu *vcpu);
4359
-void lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
4360
-void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l);
4362
-int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
4363
-int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data);
4365
-void fx_init(struct kvm_vcpu *vcpu);
4367
+void kvm_vcpu_block(struct kvm_vcpu *vcpu);
4368
void kvm_resched(struct kvm_vcpu *vcpu);
4369
void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
4370
void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
4371
void kvm_flush_remote_tlbs(struct kvm *kvm);
4373
-int emulator_read_std(unsigned long addr,
4375
- unsigned int bytes,
4376
- struct kvm_vcpu *vcpu);
4377
-int emulator_write_emulated(unsigned long addr,
4379
- unsigned int bytes,
4380
- struct kvm_vcpu *vcpu);
4382
-unsigned long segment_base(u16 selector);
4384
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
4385
- const u8 *new, int bytes);
4386
-int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
4387
-void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
4388
-int kvm_mmu_load(struct kvm_vcpu *vcpu);
4389
-void kvm_mmu_unload(struct kvm_vcpu *vcpu);
4391
-int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run);
4392
+long kvm_arch_dev_ioctl(struct file *filp,
4393
+ unsigned int ioctl, unsigned long arg);
4394
+long kvm_arch_vcpu_ioctl(struct file *filp,
4395
+ unsigned int ioctl, unsigned long arg);
4396
+void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
4397
+void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
4399
+int kvm_dev_ioctl_check_extension(long ext);
4401
+int kvm_get_dirty_log(struct kvm *kvm,
4402
+ struct kvm_dirty_log *log, int *is_dirty);
4403
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
4404
+ struct kvm_dirty_log *log);
4406
+int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
4408
+ kvm_userspace_memory_region *mem,
4410
+long kvm_arch_vm_ioctl(struct file *filp,
4411
+ unsigned int ioctl, unsigned long arg);
4412
+void kvm_arch_destroy_vm(struct kvm *kvm);
4414
+int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu);
4415
+int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu);
4417
+int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
4418
+ struct kvm_translation *tr);
4420
+int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs);
4421
+int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs);
4422
+int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
4423
+ struct kvm_sregs *sregs);
4424
+int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
4425
+ struct kvm_sregs *sregs);
4426
+int kvm_arch_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
4427
+ struct kvm_debug_guest *dbg);
4428
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run);
4430
+int kvm_arch_init(void *opaque);
4431
+void kvm_arch_exit(void);
4433
+int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu);
4434
+void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
4436
+void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu);
4437
+void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
4438
+void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
4439
+struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id);
4440
+int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu);
4441
+void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu);
4443
+int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu);
4444
+void kvm_arch_hardware_enable(void *garbage);
4445
+void kvm_arch_hardware_disable(void *garbage);
4446
+int kvm_arch_hardware_setup(void);
4447
+void kvm_arch_hardware_unsetup(void);
4448
+void kvm_arch_check_processor_compat(void *rtn);
4450
+void kvm_free_physmem(struct kvm *kvm);
4452
+struct kvm *kvm_arch_create_vm(void);
4453
+void kvm_arch_destroy_vm(struct kvm *kvm);
4455
static inline void kvm_guest_enter(void)
4457
+ account_system_vtime(current);
4458
current->flags |= PF_VCPU;
4461
static inline void kvm_guest_exit(void)
4463
+ account_system_vtime(current);
4464
current->flags &= ~PF_VCPU;
4467
-static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
4470
- return vcpu->mmu.page_fault(vcpu, gva, error_code);
4473
-static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
4475
- if (unlikely(vcpu->kvm->n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
4476
- __kvm_mmu_free_some_pages(vcpu);
4479
-static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
4481
- if (likely(vcpu->mmu.root_hpa != INVALID_PAGE))
4484
- return kvm_mmu_load(vcpu);
4487
-static inline int is_long_mode(struct kvm_vcpu *vcpu)
4489
-#ifdef CONFIG_X86_64
4490
- return vcpu->shadow_efer & EFER_LME;
4496
-static inline int is_pae(struct kvm_vcpu *vcpu)
4498
- return vcpu->cr4 & X86_CR4_PAE;
4501
-static inline int is_pse(struct kvm_vcpu *vcpu)
4503
- return vcpu->cr4 & X86_CR4_PSE;
4506
-static inline int is_paging(struct kvm_vcpu *vcpu)
4508
- return vcpu->cr0 & X86_CR0_PG;
4511
static inline int memslot_id(struct kvm *kvm, struct kvm_memory_slot *slot)
4513
return slot - kvm->memslots;
4516
-static inline struct kvm_mmu_page *page_header(hpa_t shadow_page)
4518
- struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT);
4520
- return (struct kvm_mmu_page *)page_private(page);
4523
-static inline u16 read_fs(void)
4526
- asm ("mov %%fs, %0" : "=g"(seg));
4530
-static inline u16 read_gs(void)
4533
- asm ("mov %%gs, %0" : "=g"(seg));
4537
-static inline u16 read_ldt(void)
4540
- asm ("sldt %0" : "=g"(ldt));
4544
-static inline void load_fs(u16 sel)
4546
- asm ("mov %0, %%fs" : : "rm"(sel));
4549
-static inline void load_gs(u16 sel)
4551
- asm ("mov %0, %%gs" : : "rm"(sel));
4555
-static inline void load_ldt(u16 sel)
4557
- asm ("lldt %0" : : "rm"(sel));
4561
-static inline void get_idt(struct descriptor_table *table)
4563
- asm ("sidt %0" : "=m"(*table));
4566
-static inline void get_gdt(struct descriptor_table *table)
4568
- asm ("sgdt %0" : "=m"(*table));
4571
-static inline unsigned long read_tr_base(void)
4572
+static inline gpa_t gfn_to_gpa(gfn_t gfn)
4575
- asm ("str %0" : "=g"(tr));
4576
- return segment_base(tr);
4577
+ return (gpa_t)gfn << PAGE_SHIFT;
4580
-#ifdef CONFIG_X86_64
4581
-static inline unsigned long read_msr(unsigned long msr)
4585
- rdmsrl(msr, value);
4590
-static inline void fx_save(struct i387_fxsave_struct *image)
4592
- asm ("fxsave (%0)":: "r" (image));
4595
-static inline void fx_restore(struct i387_fxsave_struct *image)
4597
- asm ("fxrstor (%0)":: "r" (image));
4600
-static inline void fpu_init(void)
4605
-static inline u32 get_rdx_init_val(void)
4607
- return 0x600; /* P6 family */
4609
+enum kvm_stat_kind {
4614
-#define ASM_VMX_VMCLEAR_RAX ".byte 0x66, 0x0f, 0xc7, 0x30"
4615
-#define ASM_VMX_VMLAUNCH ".byte 0x0f, 0x01, 0xc2"
4616
-#define ASM_VMX_VMRESUME ".byte 0x0f, 0x01, 0xc3"
4617
-#define ASM_VMX_VMPTRLD_RAX ".byte 0x0f, 0xc7, 0x30"
4618
-#define ASM_VMX_VMREAD_RDX_RAX ".byte 0x0f, 0x78, 0xd0"
4619
-#define ASM_VMX_VMWRITE_RAX_RDX ".byte 0x0f, 0x79, 0xd0"
4620
-#define ASM_VMX_VMWRITE_RSP_RDX ".byte 0x0f, 0x79, 0xd4"
4621
-#define ASM_VMX_VMXOFF ".byte 0x0f, 0x01, 0xc4"
4622
-#define ASM_VMX_VMXON_RAX ".byte 0xf3, 0x0f, 0xc7, 0x30"
4624
-#define MSR_IA32_TIME_STAMP_COUNTER 0x010
4626
-#define TSS_IOPB_BASE_OFFSET 0x66
4627
-#define TSS_BASE_SIZE 0x68
4628
-#define TSS_IOPB_SIZE (65536 / 8)
4629
-#define TSS_REDIRECTION_SIZE (256 / 8)
4630
-#define RMODE_TSS_SIZE (TSS_BASE_SIZE + TSS_REDIRECTION_SIZE + TSS_IOPB_SIZE + 1)
4631
+struct kvm_stats_debugfs_item {
4634
+ enum kvm_stat_kind kind;
4635
+ struct dentry *dentry;
4637
+extern struct kvm_stats_debugfs_item debugfs_entries[];
4640
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 47c10b8..7b5129e 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
-#include "x86_emulate.h"
-#include "segment_descriptor.h"
#include <linux/kvm.h>
@@ -39,158 +38,57 @@
#include <linux/smp.h>
#include <linux/anon_inodes.h>
#include <linux/profile.h>
+#include <linux/kvm_para.h>
+#include <linux/pagemap.h>
+#include <linux/mman.h>
#include <asm/processor.h>
-#include <asm/msr.h>
#include <asm/uaccess.h>
#include <asm/desc.h>
+#include <asm/pgtable.h>
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
-static DEFINE_SPINLOCK(kvm_lock);
-static LIST_HEAD(vm_list);
+DEFINE_SPINLOCK(kvm_lock);
+LIST_HEAD(vm_list);
static cpumask_t cpus_hardware_enabled;
-struct kvm_x86_ops *kvm_x86_ops;
struct kmem_cache *kvm_vcpu_cache;
EXPORT_SYMBOL_GPL(kvm_vcpu_cache);
static __read_mostly struct preempt_ops kvm_preempt_ops;
-#define STAT_OFFSET(x) offsetof(struct kvm_vcpu, stat.x)
-static struct kvm_stats_debugfs_item {
- struct dentry *dentry;
-} debugfs_entries[] = {
- { "pf_fixed", STAT_OFFSET(pf_fixed) },
- { "pf_guest", STAT_OFFSET(pf_guest) },
- { "tlb_flush", STAT_OFFSET(tlb_flush) },
- { "invlpg", STAT_OFFSET(invlpg) },
- { "exits", STAT_OFFSET(exits) },
- { "io_exits", STAT_OFFSET(io_exits) },
- { "mmio_exits", STAT_OFFSET(mmio_exits) },
- { "signal_exits", STAT_OFFSET(signal_exits) },
- { "irq_window", STAT_OFFSET(irq_window_exits) },
- { "halt_exits", STAT_OFFSET(halt_exits) },
- { "halt_wakeup", STAT_OFFSET(halt_wakeup) },
- { "request_irq", STAT_OFFSET(request_irq_exits) },
- { "irq_exits", STAT_OFFSET(irq_exits) },
- { "light_exits", STAT_OFFSET(light_exits) },
- { "efer_reload", STAT_OFFSET(efer_reload) },
static struct dentry *debugfs_dir;
-#define MAX_IO_MSRS 256
-#define CR0_RESERVED_BITS \
- (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
- | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
- | X86_CR0_NW | X86_CR0_CD | X86_CR0_PG))
-#define CR4_RESERVED_BITS \
- (~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
- | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
- | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR \
- | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
-#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
-#define EFER_RESERVED_BITS 0xfffffffffffff2fe
-#ifdef CONFIG_X86_64
-// LDT or TSS descriptor in the GDT. 16 bytes.
-struct segment_descriptor_64 {
- struct segment_descriptor s;
static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
-unsigned long segment_base(u16 selector)
- struct descriptor_table gdt;
- struct segment_descriptor *d;
- unsigned long table_base;
- typedef unsigned long ul;
- if (selector == 0)
- asm ("sgdt %0" : "=m"(gdt));
- table_base = gdt.base;
- if (selector & 4) { /* from ldt */
- asm ("sldt %0" : "=g"(ldt_selector));
- table_base = segment_base(ldt_selector);
- d = (struct segment_descriptor *)(table_base + (selector & ~7));
- v = d->base_low | ((ul)d->base_mid << 16) | ((ul)d->base_high << 24);
-#ifdef CONFIG_X86_64
- if (d->system == 0
- && (d->type == 2 || d->type == 9 || d->type == 11))
- v |= ((ul)((struct segment_descriptor_64 *)d)->base_higher) << 32;
-EXPORT_SYMBOL_GPL(segment_base);
static inline int valid_vcpu(int n)
 return likely(n >= 0 && n < KVM_MAX_VCPUS);
-void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
- if (!vcpu->fpu_active || vcpu->guest_fpu_loaded)
- vcpu->guest_fpu_loaded = 1;
- fx_save(&vcpu->host_fx_image);
- fx_restore(&vcpu->guest_fx_image);
-EXPORT_SYMBOL_GPL(kvm_load_guest_fpu);
-void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
- if (!vcpu->guest_fpu_loaded)
- vcpu->guest_fpu_loaded = 0;
- fx_save(&vcpu->guest_fx_image);
- fx_restore(&vcpu->host_fx_image);
-EXPORT_SYMBOL_GPL(kvm_put_guest_fpu);
 * Switches to specified vcpu, until a matching vcpu_put()
-static void vcpu_load(struct kvm_vcpu *vcpu)
+void vcpu_load(struct kvm_vcpu *vcpu)
 mutex_lock(&vcpu->mutex);
 preempt_notifier_register(&vcpu->preempt_notifier);
- kvm_x86_ops->vcpu_load(vcpu, cpu);
+ kvm_arch_vcpu_load(vcpu, cpu);
-static void vcpu_put(struct kvm_vcpu *vcpu)
+void vcpu_put(struct kvm_vcpu *vcpu)
- kvm_x86_ops->vcpu_put(vcpu);
+ kvm_arch_vcpu_put(vcpu);
 preempt_notifier_unregister(&vcpu->preempt_notifier);
 mutex_unlock(&vcpu->mutex);
@@ -211,12 +109,15 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 vcpu = kvm->vcpus[i];
- if (test_and_set_bit(KVM_TLB_FLUSH, &vcpu->requests))
+ if (test_and_set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests))
 if (cpu != -1 && cpu != raw_smp_processor_id())
+ if (cpus_empty(cpus))
+ ++kvm->stat.remote_tlb_flush;
 smp_call_function_mask(cpus, ack_flush, NULL, 1);
@@ -227,13 +128,8 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 mutex_init(&vcpu->mutex);
- vcpu->mmu.root_hpa = INVALID_PAGE;
- if (!irqchip_in_kernel(kvm) || id == 0)
- vcpu->mp_state = VCPU_MP_STATE_RUNNABLE;
- vcpu->mp_state = VCPU_MP_STATE_UNINITIALIZED;
 init_waitqueue_head(&vcpu->wq);
 page = alloc_page(GFP_KERNEL | __GFP_ZERO);
@@ -243,53 +139,41 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 vcpu->run = page_address(page);
- page = alloc_page(GFP_KERNEL | __GFP_ZERO);
- goto fail_free_run;
- vcpu->pio_data = page_address(page);
- r = kvm_mmu_create(vcpu);
+ r = kvm_arch_vcpu_init(vcpu);
- goto fail_free_pio_data;
+ goto fail_free_run;
-fail_free_pio_data:
- free_page((unsigned long)vcpu->pio_data);
 free_page((unsigned long)vcpu->run);
EXPORT_SYMBOL_GPL(kvm_vcpu_init);
void kvm_vcpu_uninit(struct kvm_vcpu *vcpu)
- kvm_mmu_destroy(vcpu);
- hrtimer_cancel(&vcpu->apic->timer.dev);
- kvm_free_apic(vcpu->apic);
- free_page((unsigned long)vcpu->pio_data);
+ kvm_arch_vcpu_uninit(vcpu);
 free_page((unsigned long)vcpu->run);
EXPORT_SYMBOL_GPL(kvm_vcpu_uninit);
static struct kvm *kvm_create_vm(void)
- struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
+ struct kvm *kvm = kvm_arch_create_vm();
- return ERR_PTR(-ENOMEM);
+ kvm->mm = current->mm;
+ atomic_inc(&kvm->mm->mm_count);
 kvm_io_bus_init(&kvm->pio_bus);
 mutex_init(&kvm->lock);
- INIT_LIST_HEAD(&kvm->active_mmu_pages);
 kvm_io_bus_init(&kvm->mmio_bus);
 spin_lock(&kvm_lock);
 list_add(&kvm->vm_list, &vm_list);
 spin_unlock(&kvm_lock);
@@ -299,25 +183,18 @@ static struct kvm *kvm_create_vm(void)
static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 struct kvm_memory_slot *dont)
- if (!dont || free->phys_mem != dont->phys_mem)
- if (free->phys_mem) {
- for (i = 0; i < free->npages; ++i)
- if (free->phys_mem[i])
- __free_page(free->phys_mem[i]);
- vfree(free->phys_mem);
+ if (!dont || free->rmap != dont->rmap)
+ vfree(free->rmap);
 if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
 vfree(free->dirty_bitmap);
- free->phys_mem = NULL;
 free->dirty_bitmap = NULL;
+ free->rmap = NULL;
-static void kvm_free_physmem(struct kvm *kvm)
+void kvm_free_physmem(struct kvm *kvm)
@@ -325,55 +202,17 @@ static void kvm_free_physmem(struct kvm *kvm)
 kvm_free_physmem_slot(&kvm->memslots[i], NULL);
-static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
- for (i = 0; i < ARRAY_SIZE(vcpu->pio.guest_pages); ++i)
- if (vcpu->pio.guest_pages[i]) {
- __free_page(vcpu->pio.guest_pages[i]);
- vcpu->pio.guest_pages[i] = NULL;
-static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
- kvm_mmu_unload(vcpu);
-static void kvm_free_vcpus(struct kvm *kvm)
- * Unpin any mmu pages first.
- for (i = 0; i < KVM_MAX_VCPUS; ++i)
- if (kvm->vcpus[i])
- kvm_unload_vcpu_mmu(kvm->vcpus[i]);
- for (i = 0; i < KVM_MAX_VCPUS; ++i) {
- if (kvm->vcpus[i]) {
- kvm_x86_ops->vcpu_free(kvm->vcpus[i]);
- kvm->vcpus[i] = NULL;
static void kvm_destroy_vm(struct kvm *kvm)
+ struct mm_struct *mm = kvm->mm;
 spin_lock(&kvm_lock);
 list_del(&kvm->vm_list);
 spin_unlock(&kvm_lock);
 kvm_io_bus_destroy(&kvm->pio_bus);
 kvm_io_bus_destroy(&kvm->mmio_bus);
- kfree(kvm->vioapic);
- kvm_free_vcpus(kvm);
- kvm_free_physmem(kvm);
+ kvm_arch_destroy_vm(kvm);
static int kvm_vm_release(struct inode *inode, struct file *filp)
@@ -384,275 +223,17 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
-static void inject_gp(struct kvm_vcpu *vcpu)
- kvm_x86_ops->inject_gp(vcpu, 0);
- * Load the pae pdptrs. Return true is they are all valid.
-static int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
- gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
- unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
- struct page *page;
- u64 pdpte[ARRAY_SIZE(vcpu->pdptrs)];
- mutex_lock(&vcpu->kvm->lock);
- page = gfn_to_page(vcpu->kvm, pdpt_gfn);
- pdpt = kmap_atomic(page, KM_USER0);
- memcpy(pdpte, pdpt+offset, sizeof(pdpte));
- kunmap_atomic(pdpt, KM_USER0);
- for (i = 0; i < ARRAY_SIZE(pdpte); ++i) {
- if ((pdpte[i] & 1) && (pdpte[i] & 0xfffffff0000001e6ull)) {
- memcpy(vcpu->pdptrs, pdpte, sizeof(vcpu->pdptrs));
- mutex_unlock(&vcpu->kvm->lock);
-void set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
- if (cr0 & CR0_RESERVED_BITS) {
- printk(KERN_DEBUG "set_cr0: 0x%lx #GP, reserved bits 0x%lx\n",
- if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD)) {
- printk(KERN_DEBUG "set_cr0: #GP, CD == 0 && NW == 1\n");
- if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE)) {
- printk(KERN_DEBUG "set_cr0: #GP, set PG flag "
- "and a clear PE flag\n");
- if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
-#ifdef CONFIG_X86_64
- if ((vcpu->shadow_efer & EFER_LME)) {
- if (!is_pae(vcpu)) {
- printk(KERN_DEBUG "set_cr0: #GP, start paging "
- "in long mode while PAE is disabled\n");
- kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
- printk(KERN_DEBUG "set_cr0: #GP, start paging "
- "in long mode while CS.L == 1\n");
- if (is_pae(vcpu) && !load_pdptrs(vcpu, vcpu->cr3)) {
- printk(KERN_DEBUG "set_cr0: #GP, pdptrs "
- "reserved bits\n");
- kvm_x86_ops->set_cr0(vcpu, cr0);
- mutex_lock(&vcpu->kvm->lock);
- kvm_mmu_reset_context(vcpu);
- mutex_unlock(&vcpu->kvm->lock);
-EXPORT_SYMBOL_GPL(set_cr0);
-void lmsw(struct kvm_vcpu *vcpu, unsigned long msw)
- set_cr0(vcpu, (vcpu->cr0 & ~0x0ful) | (msw & 0x0f));
-EXPORT_SYMBOL_GPL(lmsw);
-void set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
- if (cr4 & CR4_RESERVED_BITS) {
- printk(KERN_DEBUG "set_cr4: #GP, reserved bits\n");
- if (is_long_mode(vcpu)) {
- if (!(cr4 & X86_CR4_PAE)) {
- printk(KERN_DEBUG "set_cr4: #GP, clearing PAE while "
- "in long mode\n");
- } else if (is_paging(vcpu) && !is_pae(vcpu) && (cr4 & X86_CR4_PAE)
- && !load_pdptrs(vcpu, vcpu->cr3)) {
- printk(KERN_DEBUG "set_cr4: #GP, pdptrs reserved bits\n");
- if (cr4 & X86_CR4_VMXE) {
- printk(KERN_DEBUG "set_cr4: #GP, setting VMXE\n");
- kvm_x86_ops->set_cr4(vcpu, cr4);
- mutex_lock(&vcpu->kvm->lock);
- kvm_mmu_reset_context(vcpu);
- mutex_unlock(&vcpu->kvm->lock);
-EXPORT_SYMBOL_GPL(set_cr4);
-void set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
- if (is_long_mode(vcpu)) {
- if (cr3 & CR3_L_MODE_RESERVED_BITS) {
- printk(KERN_DEBUG "set_cr3: #GP, reserved bits\n");
- if (is_pae(vcpu)) {
- if (cr3 & CR3_PAE_RESERVED_BITS) {
- "set_cr3: #GP, reserved bits\n");
- if (is_paging(vcpu) && !load_pdptrs(vcpu, cr3)) {
- printk(KERN_DEBUG "set_cr3: #GP, pdptrs "
- "reserved bits\n");
- if (cr3 & CR3_NONPAE_RESERVED_BITS) {
- "set_cr3: #GP, reserved bits\n");
- mutex_lock(&vcpu->kvm->lock);
- * Does the new cr3 value map to physical memory? (Note, we
- * catch an invalid cr3 even in real-mode, because it would
- * cause trouble later on when we turn on paging anyway.)
- * A real CPU would silently accept an invalid cr3 and would
- * attempt to use it - with largely undefined (and often hard
- * to debug) behavior on the guest side.
- if (unlikely(!gfn_to_memslot(vcpu->kvm, cr3 >> PAGE_SHIFT)))
- vcpu->mmu.new_cr3(vcpu);
- mutex_unlock(&vcpu->kvm->lock);
-EXPORT_SYMBOL_GPL(set_cr3);
-void set_cr8(struct kvm_vcpu *vcpu, unsigned long cr8)
- if (cr8 & CR8_RESERVED_BITS) {
- printk(KERN_DEBUG "set_cr8: #GP, reserved bits 0x%lx\n", cr8);
- if (irqchip_in_kernel(vcpu->kvm))
- kvm_lapic_set_tpr(vcpu, cr8);
-EXPORT_SYMBOL_GPL(set_cr8);
-unsigned long get_cr8(struct kvm_vcpu *vcpu)
- if (irqchip_in_kernel(vcpu->kvm))
- return kvm_lapic_get_cr8(vcpu);
-EXPORT_SYMBOL_GPL(get_cr8);
-u64 kvm_get_apic_base(struct kvm_vcpu *vcpu)
- if (irqchip_in_kernel(vcpu->kvm))
- return vcpu->apic_base;
- return vcpu->apic_base;
-EXPORT_SYMBOL_GPL(kvm_get_apic_base);
-void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data)
- /* TODO: reserve bits check */
- if (irqchip_in_kernel(vcpu->kvm))
- kvm_lapic_set_base(vcpu, data);
- vcpu->apic_base = data;
-EXPORT_SYMBOL_GPL(kvm_set_apic_base);
-void fx_init(struct kvm_vcpu *vcpu)
- unsigned after_mxcsr_mask;
- /* Initialize guest FPU by resetting ours and saving into guest's */
- preempt_disable();
- fx_save(&vcpu->host_fx_image);
- fx_save(&vcpu->guest_fx_image);
- fx_restore(&vcpu->host_fx_image);
- vcpu->cr0 |= X86_CR0_ET;
- after_mxcsr_mask = offsetof(struct i387_fxsave_struct, st_space);
- vcpu->guest_fx_image.mxcsr = 0x1f80;
- memset((void *)&vcpu->guest_fx_image + after_mxcsr_mask,
- 0, sizeof(struct i387_fxsave_struct) - after_mxcsr_mask);
-EXPORT_SYMBOL_GPL(fx_init);
 * Allocate some memory and give it an address in the guest physical address
 * Discontiguous memory is allowed, mostly for framebuffers.
+ * Must be called holding kvm->lock.
-static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
- struct kvm_memory_region *mem)
+int __kvm_set_memory_region(struct kvm *kvm,
+ struct kvm_userspace_memory_region *mem,
@@ -667,7 +248,7 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 if (mem->guest_phys_addr & (PAGE_SIZE - 1))
- if (mem->slot >= KVM_MEMORY_SLOTS)
+ if (mem->slot >= KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
 if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
@@ -679,8 +260,6 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
- mutex_lock(&kvm->lock);
 new = old = *memslot;
 new.base_gfn = base_gfn;
@@ -690,7 +269,7 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 /* Disallow changing a memory slot's size. */
 if (npages && old.npages && npages != old.npages)
 /* Check for overlaps */
@@ -701,13 +280,9 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 if (!((base_gfn + npages <= s->base_gfn) ||
 (base_gfn >= s->base_gfn + s->npages)))
- /* Deallocate if slot is being removed */
- new.phys_mem = NULL;
 /* Free page dirty bitmap if unneeded */
 if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
 new.dirty_bitmap = NULL;
@@ -715,20 +290,16 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 /* Allocate if a slot is being created */
- if (npages && !new.phys_mem) {
- new.phys_mem = vmalloc(npages * sizeof(struct page *));
- if (!new.phys_mem)
- memset(new.phys_mem, 0, npages * sizeof(struct page *));
- for (i = 0; i < npages; ++i) {
- new.phys_mem[i] = alloc_page(GFP_HIGHUSER
- if (!new.phys_mem[i])
- set_page_private(new.phys_mem[i],0);
+ if (npages && !new.rmap) {
+ new.rmap = vmalloc(npages * sizeof(struct page *));
+ memset(new.rmap, 0, npages * sizeof(*new.rmap));
+ new.user_alloc = user_alloc;
+ new.userspace_addr = mem->userspace_addr;
 /* Allocate page dirty bitmap if needed */
@@ -737,7 +308,7 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 new.dirty_bitmap = vmalloc(dirty_bytes);
 if (!new.dirty_bitmap)
 memset(new.dirty_bitmap, 0, dirty_bytes);
@@ -746,34 +317,54 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
- kvm_mmu_slot_remove_write_access(kvm, mem->slot);
- kvm_flush_remote_tlbs(kvm);
- mutex_unlock(&kvm->lock);
+ r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
 kvm_free_physmem_slot(&old, &new);
- mutex_unlock(&kvm->lock);
 kvm_free_physmem_slot(&new, &old);
+EXPORT_SYMBOL_GPL(__kvm_set_memory_region);
- * Get (and clear) the dirty memory log for a memory slot.
-static int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
- struct kvm_dirty_log *log)
+int kvm_set_memory_region(struct kvm *kvm,
+ struct kvm_userspace_memory_region *mem,
+ mutex_lock(&kvm->lock);
+ r = __kvm_set_memory_region(kvm, mem, user_alloc);
+ mutex_unlock(&kvm->lock);
+EXPORT_SYMBOL_GPL(kvm_set_memory_region);
+int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
+ struct kvm_userspace_memory_region *mem,
+ if (mem->slot >= KVM_MEMORY_SLOTS)
+ return kvm_set_memory_region(kvm, mem, user_alloc);
+int kvm_get_dirty_log(struct kvm *kvm,
+ struct kvm_dirty_log *log, int *is_dirty)
 struct kvm_memory_slot *memslot;
 unsigned long any = 0;
- mutex_lock(&kvm->lock);
 if (log->slot >= KVM_MEMORY_SLOTS)
@@ -792,138 +383,30 @@ static int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 if (copy_to_user(log->dirty_bitmap, memslot->dirty_bitmap, n))
- /* If nothing is dirty, don't bother messing with page tables. */
- kvm_mmu_slot_remove_write_access(kvm, log->slot);
- kvm_flush_remote_tlbs(kvm);
- memset(memslot->dirty_bitmap, 0, n);
- mutex_unlock(&kvm->lock);
- * Set a new alias region. Aliases map a portion of physical memory into
- * another portion. This is useful for memory windows, for example the PC
-static int kvm_vm_ioctl_set_memory_alias(struct kvm *kvm,
- struct kvm_memory_alias *alias)
- struct kvm_mem_alias *p;
- /* General sanity checks */
- if (alias->memory_size & (PAGE_SIZE - 1))
- if (alias->guest_phys_addr & (PAGE_SIZE - 1))
- if (alias->slot >= KVM_ALIAS_SLOTS)
- if (alias->guest_phys_addr + alias->memory_size
- < alias->guest_phys_addr)
- if (alias->target_phys_addr + alias->memory_size
- < alias->target_phys_addr)
- mutex_lock(&kvm->lock);
- p = &kvm->aliases[alias->slot];
- p->base_gfn = alias->guest_phys_addr >> PAGE_SHIFT;
- p->npages = alias->memory_size >> PAGE_SHIFT;
- p->target_gfn = alias->target_phys_addr >> PAGE_SHIFT;
- for (n = KVM_ALIAS_SLOTS; n > 0; --n)
- if (kvm->aliases[n - 1].npages)
- kvm->naliases = n;
- kvm_mmu_zap_all(kvm);
- mutex_unlock(&kvm->lock);
-static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
+int is_error_page(struct page *page)
- switch (chip->chip_id) {
- case KVM_IRQCHIP_PIC_MASTER:
- memcpy (&chip->chip.pic,
- &pic_irqchip(kvm)->pics[0],
- sizeof(struct kvm_pic_state));
- case KVM_IRQCHIP_PIC_SLAVE:
- memcpy (&chip->chip.pic,
- &pic_irqchip(kvm)->pics[1],
- sizeof(struct kvm_pic_state));
- case KVM_IRQCHIP_IOAPIC:
- memcpy (&chip->chip.ioapic,
- ioapic_irqchip(kvm),
- sizeof(struct kvm_ioapic_state));
+ return page == bad_page;
+EXPORT_SYMBOL_GPL(is_error_page);
-static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
+static inline unsigned long bad_hva(void)
- switch (chip->chip_id) {
- case KVM_IRQCHIP_PIC_MASTER:
- memcpy (&pic_irqchip(kvm)->pics[0],
- sizeof(struct kvm_pic_state));
- case KVM_IRQCHIP_PIC_SLAVE:
- memcpy (&pic_irqchip(kvm)->pics[1],
- sizeof(struct kvm_pic_state));
- case KVM_IRQCHIP_IOAPIC:
- memcpy (ioapic_irqchip(kvm),
- &chip->chip.ioapic,
- sizeof(struct kvm_ioapic_state));
- kvm_pic_update_irq(pic_irqchip(kvm));
+ return PAGE_OFFSET;
-static gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
+int kvm_is_error_hva(unsigned long addr)
- struct kvm_mem_alias *alias;
- for (i = 0; i < kvm->naliases; ++i) {
- alias = &kvm->aliases[i];
- if (gfn >= alias->base_gfn
- && gfn < alias->base_gfn + alias->npages)
- return alias->target_gfn + gfn - alias->base_gfn;
+ return addr == bad_hva();
+EXPORT_SYMBOL_GPL(kvm_is_error_hva);
static struct kvm_memory_slot *__gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
@@ -945,385 +428,213 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 return __gfn_to_memslot(kvm, gfn);
-struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
+int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
- struct kvm_memory_slot *slot;
 gfn = unalias_gfn(kvm, gfn);
- slot = __gfn_to_memslot(kvm, gfn);
- return slot->phys_mem[gfn - slot->base_gfn];
-EXPORT_SYMBOL_GPL(gfn_to_page);
-/* WARNING: Does not work on aliased pages. */
-void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
- struct kvm_memory_slot *memslot;
- memslot = __gfn_to_memslot(kvm, gfn);
- if (memslot && memslot->dirty_bitmap) {
- unsigned long rel_gfn = gfn - memslot->base_gfn;
+ for (i = 0; i < KVM_MEMORY_SLOTS; ++i) {
+ struct kvm_memory_slot *memslot = &kvm->memslots[i];
- if (!test_bit(rel_gfn, memslot->dirty_bitmap))
- set_bit(rel_gfn, memslot->dirty_bitmap);
+ if (gfn >= memslot->base_gfn
+ && gfn < memslot->base_gfn + memslot->npages)
+EXPORT_SYMBOL_GPL(kvm_is_visible_gfn);
-int emulator_read_std(unsigned long addr,
- unsigned int bytes,
- struct kvm_vcpu *vcpu)
+static unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
- gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
- unsigned offset = addr & (PAGE_SIZE-1);
- unsigned tocopy = min(bytes, (unsigned)PAGE_SIZE - offset);
- unsigned long pfn;
- struct page *page;
- if (gpa == UNMAPPED_GVA)
- return X86EMUL_PROPAGATE_FAULT;
- pfn = gpa >> PAGE_SHIFT;
- page = gfn_to_page(vcpu->kvm, pfn);
- return X86EMUL_UNHANDLEABLE;
- page_virt = kmap_atomic(page, KM_USER0);
5652
- memcpy(data, page_virt + offset, tocopy);
5654
- kunmap_atomic(page_virt, KM_USER0);
5661
- return X86EMUL_CONTINUE;
5663
-EXPORT_SYMBOL_GPL(emulator_read_std);
5664
+ struct kvm_memory_slot *slot;
5666
-static int emulator_write_std(unsigned long addr,
5668
- unsigned int bytes,
5669
- struct kvm_vcpu *vcpu)
5671
- pr_unimpl(vcpu, "emulator_write_std: addr %lx n %d\n", addr, bytes);
5672
- return X86EMUL_UNHANDLEABLE;
5673
+ gfn = unalias_gfn(kvm, gfn);
5674
+ slot = __gfn_to_memslot(kvm, gfn);
5677
+ return (slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE);
5681
- * Only apic need an MMIO device hook, so shortcut now..
5682
+ * Requires current->mm->mmap_sem to be held
5684
-static struct kvm_io_device *vcpu_find_pervcpu_dev(struct kvm_vcpu *vcpu,
5686
+static struct page *__gfn_to_page(struct kvm *kvm, gfn_t gfn)
5688
- struct kvm_io_device *dev;
5691
- dev = &vcpu->apic->dev;
5692
- if (dev->in_range(dev, addr))
5697
+ struct page *page[1];
5698
+ unsigned long addr;
5701
-static struct kvm_io_device *vcpu_find_mmio_dev(struct kvm_vcpu *vcpu,
5704
- struct kvm_io_device *dev;
5707
- dev = vcpu_find_pervcpu_dev(vcpu, addr);
5709
- dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr);
5713
-static struct kvm_io_device *vcpu_find_pio_dev(struct kvm_vcpu *vcpu,
5716
- return kvm_io_bus_find_dev(&vcpu->kvm->pio_bus, addr);
5719
-static int emulator_read_emulated(unsigned long addr,
5721
- unsigned int bytes,
5722
- struct kvm_vcpu *vcpu)
5724
- struct kvm_io_device *mmio_dev;
5727
- if (vcpu->mmio_read_completed) {
5728
- memcpy(val, vcpu->mmio_data, bytes);
5729
- vcpu->mmio_read_completed = 0;
5730
- return X86EMUL_CONTINUE;
5731
- } else if (emulator_read_std(addr, val, bytes, vcpu)
5732
- == X86EMUL_CONTINUE)
5733
- return X86EMUL_CONTINUE;
5734
+ addr = gfn_to_hva(kvm, gfn);
5735
+ if (kvm_is_error_hva(addr)) {
5736
+ get_page(bad_page);
5740
- gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
5741
- if (gpa == UNMAPPED_GVA)
5742
- return X86EMUL_PROPAGATE_FAULT;
5743
+ npages = get_user_pages(current, current->mm, addr, 1, 1, 0, page,
5747
- * Is this MMIO handled locally?
5749
- mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
5751
- kvm_iodevice_read(mmio_dev, gpa, bytes, val);
5752
- return X86EMUL_CONTINUE;
5753
+ if (npages != 1) {
5754
+ get_page(bad_page);
5758
- vcpu->mmio_needed = 1;
5759
- vcpu->mmio_phys_addr = gpa;
5760
- vcpu->mmio_size = bytes;
5761
- vcpu->mmio_is_write = 0;
5763
- return X86EMUL_UNHANDLEABLE;
5767
-static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
5768
- const void *val, int bytes)
5769
+struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
5774
- if (((gpa + bytes - 1) >> PAGE_SHIFT) != (gpa >> PAGE_SHIFT))
5776
- page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
5779
- mark_page_dirty(vcpu->kvm, gpa >> PAGE_SHIFT);
5780
- virt = kmap_atomic(page, KM_USER0);
5781
- kvm_mmu_pte_write(vcpu, gpa, val, bytes);
5782
- memcpy(virt + offset_in_page(gpa), val, bytes);
5783
- kunmap_atomic(virt, KM_USER0);
5787
-static int emulator_write_emulated_onepage(unsigned long addr,
5789
- unsigned int bytes,
5790
- struct kvm_vcpu *vcpu)
5792
- struct kvm_io_device *mmio_dev;
5793
- gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
5795
- if (gpa == UNMAPPED_GVA) {
5796
- kvm_x86_ops->inject_page_fault(vcpu, addr, 2);
5797
- return X86EMUL_PROPAGATE_FAULT;
5800
- if (emulator_write_phys(vcpu, gpa, val, bytes))
5801
- return X86EMUL_CONTINUE;
5802
+ down_read(¤t->mm->mmap_sem);
5803
+ page = __gfn_to_page(kvm, gfn);
5804
+ up_read(¤t->mm->mmap_sem);
5807
- * Is this MMIO handled locally?
5809
- mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
5811
- kvm_iodevice_write(mmio_dev, gpa, bytes, val);
5812
- return X86EMUL_CONTINUE;
5815
- vcpu->mmio_needed = 1;
5816
- vcpu->mmio_phys_addr = gpa;
5817
- vcpu->mmio_size = bytes;
5818
- vcpu->mmio_is_write = 1;
5819
- memcpy(vcpu->mmio_data, val, bytes);
5821
- return X86EMUL_CONTINUE;
5825
-int emulator_write_emulated(unsigned long addr,
5827
- unsigned int bytes,
5828
- struct kvm_vcpu *vcpu)
5830
- /* Crossing a page boundary? */
5831
- if (((addr + bytes - 1) ^ addr) & PAGE_MASK) {
5834
- now = -addr & ~PAGE_MASK;
5835
- rc = emulator_write_emulated_onepage(addr, val, now, vcpu);
5836
- if (rc != X86EMUL_CONTINUE)
5842
- return emulator_write_emulated_onepage(addr, val, bytes, vcpu);
5844
-EXPORT_SYMBOL_GPL(emulator_write_emulated);
5845
+EXPORT_SYMBOL_GPL(gfn_to_page);
5847
-static int emulator_cmpxchg_emulated(unsigned long addr,
5850
- unsigned int bytes,
5851
- struct kvm_vcpu *vcpu)
5852
+void kvm_release_page_clean(struct page *page)
5854
- static int reported;
5858
- printk(KERN_WARNING "kvm: emulating exchange as write\n");
5860
- return emulator_write_emulated(addr, new, bytes, vcpu);
5863
+EXPORT_SYMBOL_GPL(kvm_release_page_clean);
5865
-static unsigned long get_segment_base(struct kvm_vcpu *vcpu, int seg)
5866
+void kvm_release_page_dirty(struct page *page)
5868
- return kvm_x86_ops->get_segment_base(vcpu, seg);
5869
+ if (!PageReserved(page))
5870
+ SetPageDirty(page);
5873
+EXPORT_SYMBOL_GPL(kvm_release_page_dirty);
5875
-int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
5876
+static int next_segment(unsigned long len, int offset)
5878
- return X86EMUL_CONTINUE;
5879
+ if (len > PAGE_SIZE - offset)
5880
+ return PAGE_SIZE - offset;
5885
-int emulate_clts(struct kvm_vcpu *vcpu)
5886
+int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
5889
- kvm_x86_ops->set_cr0(vcpu, vcpu->cr0 & ~X86_CR0_TS);
5890
- return X86EMUL_CONTINUE;
5892
+ unsigned long addr;
5894
+ addr = gfn_to_hva(kvm, gfn);
5895
+ if (kvm_is_error_hva(addr))
5897
+ r = copy_from_user(data, (void __user *)addr + offset, len);
5902
+EXPORT_SYMBOL_GPL(kvm_read_guest_page);
5904
-int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr, unsigned long *dest)
5905
+int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len)
5907
- struct kvm_vcpu *vcpu = ctxt->vcpu;
5908
+ gfn_t gfn = gpa >> PAGE_SHIFT;
5910
+ int offset = offset_in_page(gpa);
5915
- *dest = kvm_x86_ops->get_dr(vcpu, dr);
5916
- return X86EMUL_CONTINUE;
5918
- pr_unimpl(vcpu, "%s: unexpected dr %u\n", __FUNCTION__, dr);
5919
- return X86EMUL_UNHANDLEABLE;
5920
+ while ((seg = next_segment(len, offset)) != 0) {
5921
+ ret = kvm_read_guest_page(kvm, gfn, data, offset, seg);
5931
+EXPORT_SYMBOL_GPL(kvm_read_guest);
5933
-int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long value)
5934
+int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
5935
+ int offset, int len)
5937
- unsigned long mask = (ctxt->mode == X86EMUL_MODE_PROT64) ? ~0ULL : ~0U;
5940
+ unsigned long addr;
5942
- kvm_x86_ops->set_dr(ctxt->vcpu, dr, value & mask, &exception);
5944
- /* FIXME: better handling */
5945
- return X86EMUL_UNHANDLEABLE;
5947
- return X86EMUL_CONTINUE;
5948
+ addr = gfn_to_hva(kvm, gfn);
5949
+ if (kvm_is_error_hva(addr))
5951
+ r = copy_to_user((void __user *)addr + offset, data, len);
5954
+ mark_page_dirty(kvm, gfn);
5957
+EXPORT_SYMBOL_GPL(kvm_write_guest_page);
5959
-void kvm_report_emulation_failure(struct kvm_vcpu *vcpu, const char *context)
5960
+int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
5961
+ unsigned long len)
5963
- static int reported;
5965
- unsigned long rip = vcpu->rip;
5966
- unsigned long rip_linear;
5968
- rip_linear = rip + get_segment_base(vcpu, VCPU_SREG_CS);
5973
- emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu);
5974
+ gfn_t gfn = gpa >> PAGE_SHIFT;
5976
+ int offset = offset_in_page(gpa);
5979
- printk(KERN_ERR "emulation failed (%s) rip %lx %02x %02x %02x %02x\n",
5980
- context, rip, opcodes[0], opcodes[1], opcodes[2], opcodes[3]);
5982
+ while ((seg = next_segment(len, offset)) != 0) {
5983
+ ret = kvm_write_guest_page(kvm, gfn, data, offset, seg);
5993
-EXPORT_SYMBOL_GPL(kvm_report_emulation_failure);
5995
-struct x86_emulate_ops emulate_ops = {
5996
- .read_std = emulator_read_std,
5997
- .write_std = emulator_write_std,
5998
- .read_emulated = emulator_read_emulated,
5999
- .write_emulated = emulator_write_emulated,
6000
- .cmpxchg_emulated = emulator_cmpxchg_emulated,
6002
+int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
6004
+ return kvm_write_guest_page(kvm, gfn, empty_zero_page, offset, len);
6006
+EXPORT_SYMBOL_GPL(kvm_clear_guest_page);
6008
-int emulate_instruction(struct kvm_vcpu *vcpu,
6009
- struct kvm_run *run,
6010
- unsigned long cr2,
6012
+int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len)
6014
- struct x86_emulate_ctxt emulate_ctxt;
6018
- vcpu->mmio_fault_cr2 = cr2;
6019
- kvm_x86_ops->cache_regs(vcpu);
6021
- kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
6023
- emulate_ctxt.vcpu = vcpu;
6024
- emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
6025
- emulate_ctxt.cr2 = cr2;
6026
- emulate_ctxt.mode = (emulate_ctxt.eflags & X86_EFLAGS_VM)
6027
- ? X86EMUL_MODE_REAL : cs_l
6028
- ? X86EMUL_MODE_PROT64 : cs_db
6029
- ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
6031
- if (emulate_ctxt.mode == X86EMUL_MODE_PROT64) {
6032
- emulate_ctxt.cs_base = 0;
6033
- emulate_ctxt.ds_base = 0;
6034
- emulate_ctxt.es_base = 0;
6035
- emulate_ctxt.ss_base = 0;
6037
- emulate_ctxt.cs_base = get_segment_base(vcpu, VCPU_SREG_CS);
6038
- emulate_ctxt.ds_base = get_segment_base(vcpu, VCPU_SREG_DS);
6039
- emulate_ctxt.es_base = get_segment_base(vcpu, VCPU_SREG_ES);
6040
- emulate_ctxt.ss_base = get_segment_base(vcpu, VCPU_SREG_SS);
6042
+ gfn_t gfn = gpa >> PAGE_SHIFT;
6044
+ int offset = offset_in_page(gpa);
6047
- emulate_ctxt.gs_base = get_segment_base(vcpu, VCPU_SREG_GS);
6048
- emulate_ctxt.fs_base = get_segment_base(vcpu, VCPU_SREG_FS);
6050
- vcpu->mmio_is_write = 0;
6051
- vcpu->pio.string = 0;
6052
- r = x86_emulate_memop(&emulate_ctxt, &emulate_ops);
6053
- if (vcpu->pio.string)
6054
- return EMULATE_DO_MMIO;
6056
- if ((r || vcpu->mmio_is_write) && run) {
6057
- run->exit_reason = KVM_EXIT_MMIO;
6058
- run->mmio.phys_addr = vcpu->mmio_phys_addr;
6059
- memcpy(run->mmio.data, vcpu->mmio_data, 8);
6060
- run->mmio.len = vcpu->mmio_size;
6061
- run->mmio.is_write = vcpu->mmio_is_write;
6062
+ while ((seg = next_segment(len, offset)) != 0) {
6063
+ ret = kvm_clear_guest_page(kvm, gfn, offset, seg);
6072
+EXPORT_SYMBOL_GPL(kvm_clear_guest);
6075
- if (kvm_mmu_unprotect_page_virt(vcpu, cr2))
6076
- return EMULATE_DONE;
6077
- if (!vcpu->mmio_needed) {
6078
- kvm_report_emulation_failure(vcpu, "mmio");
6079
- return EMULATE_FAIL;
6081
- return EMULATE_DO_MMIO;
6083
+void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
6085
+ struct kvm_memory_slot *memslot;
6087
- kvm_x86_ops->decache_regs(vcpu);
6088
- kvm_x86_ops->set_rflags(vcpu, emulate_ctxt.eflags);
6089
+ gfn = unalias_gfn(kvm, gfn);
6090
+ memslot = __gfn_to_memslot(kvm, gfn);
6091
+ if (memslot && memslot->dirty_bitmap) {
6092
+ unsigned long rel_gfn = gfn - memslot->base_gfn;
6094
- if (vcpu->mmio_is_write) {
6095
- vcpu->mmio_needed = 0;
6096
- return EMULATE_DO_MMIO;
6098
+ if (!test_bit(rel_gfn, memslot->dirty_bitmap))
6099
+ set_bit(rel_gfn, memslot->dirty_bitmap);
6102
- return EMULATE_DONE;
6104
-EXPORT_SYMBOL_GPL(emulate_instruction);
6107
* The vCPU has executed a HLT instruction with in-kernel mode enabled.
6109
-static void kvm_vcpu_block(struct kvm_vcpu *vcpu)
6110
+void kvm_vcpu_block(struct kvm_vcpu *vcpu)
6112
DECLARE_WAITQUEUE(wait, current);
6114
@@ -1346,340 +657,6 @@ static void kvm_vcpu_block(struct kvm_vcpu *vcpu)
6115
remove_wait_queue(&vcpu->wq, &wait);
6118
-int kvm_emulate_halt(struct kvm_vcpu *vcpu)
6120
- ++vcpu->stat.halt_exits;
6121
- if (irqchip_in_kernel(vcpu->kvm)) {
6122
- vcpu->mp_state = VCPU_MP_STATE_HALTED;
6123
- kvm_vcpu_block(vcpu);
6124
- if (vcpu->mp_state != VCPU_MP_STATE_RUNNABLE)
6128
- vcpu->run->exit_reason = KVM_EXIT_HLT;
6132
-EXPORT_SYMBOL_GPL(kvm_emulate_halt);
6134
-int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
6136
- unsigned long nr, a0, a1, a2, a3, a4, a5, ret;
6138
- kvm_x86_ops->cache_regs(vcpu);
6139
- ret = -KVM_EINVAL;
6140
-#ifdef CONFIG_X86_64
6141
- if (is_long_mode(vcpu)) {
6142
- nr = vcpu->regs[VCPU_REGS_RAX];
6143
- a0 = vcpu->regs[VCPU_REGS_RDI];
6144
- a1 = vcpu->regs[VCPU_REGS_RSI];
6145
- a2 = vcpu->regs[VCPU_REGS_RDX];
6146
- a3 = vcpu->regs[VCPU_REGS_RCX];
6147
- a4 = vcpu->regs[VCPU_REGS_R8];
6148
- a5 = vcpu->regs[VCPU_REGS_R9];
6152
- nr = vcpu->regs[VCPU_REGS_RBX] & -1u;
6153
- a0 = vcpu->regs[VCPU_REGS_RAX] & -1u;
6154
- a1 = vcpu->regs[VCPU_REGS_RCX] & -1u;
6155
- a2 = vcpu->regs[VCPU_REGS_RDX] & -1u;
6156
- a3 = vcpu->regs[VCPU_REGS_RSI] & -1u;
6157
- a4 = vcpu->regs[VCPU_REGS_RDI] & -1u;
6158
- a5 = vcpu->regs[VCPU_REGS_RBP] & -1u;
6162
- run->hypercall.nr = nr;
6163
- run->hypercall.args[0] = a0;
6164
- run->hypercall.args[1] = a1;
6165
- run->hypercall.args[2] = a2;
6166
- run->hypercall.args[3] = a3;
6167
- run->hypercall.args[4] = a4;
6168
- run->hypercall.args[5] = a5;
6169
- run->hypercall.ret = ret;
6170
- run->hypercall.longmode = is_long_mode(vcpu);
6171
- kvm_x86_ops->decache_regs(vcpu);
6174
- vcpu->regs[VCPU_REGS_RAX] = ret;
6175
- kvm_x86_ops->decache_regs(vcpu);
6178
-EXPORT_SYMBOL_GPL(kvm_hypercall);
6180
-static u64 mk_cr_64(u64 curr_cr, u32 new_val)
6182
- return (curr_cr & ~((1ULL << 32) - 1)) | new_val;
6185
-void realmode_lgdt(struct kvm_vcpu *vcpu, u16 limit, unsigned long base)
6187
- struct descriptor_table dt = { limit, base };
6189
- kvm_x86_ops->set_gdt(vcpu, &dt);
6192
-void realmode_lidt(struct kvm_vcpu *vcpu, u16 limit, unsigned long base)
6194
- struct descriptor_table dt = { limit, base };
6196
- kvm_x86_ops->set_idt(vcpu, &dt);
6199
-void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
6200
- unsigned long *rflags)
6203
- *rflags = kvm_x86_ops->get_rflags(vcpu);
6206
-unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr)
6208
- kvm_x86_ops->decache_cr4_guest_bits(vcpu);
6219
- vcpu_printf(vcpu, "%s: unexpected cr %u\n", __FUNCTION__, cr);
6224
-void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long val,
6225
- unsigned long *rflags)
6229
- set_cr0(vcpu, mk_cr_64(vcpu->cr0, val));
6230
- *rflags = kvm_x86_ops->get_rflags(vcpu);
6236
- set_cr3(vcpu, val);
6239
- set_cr4(vcpu, mk_cr_64(vcpu->cr4, val));
6242
- vcpu_printf(vcpu, "%s: unexpected cr %u\n", __FUNCTION__, cr);
6247
- * Register the para guest with the host:
6249
-static int vcpu_register_para(struct kvm_vcpu *vcpu, gpa_t para_state_gpa)
6251
- struct kvm_vcpu_para_state *para_state;
6252
- hpa_t para_state_hpa, hypercall_hpa;
6253
- struct page *para_state_page;
6254
- unsigned char *hypercall;
6255
- gpa_t hypercall_gpa;
6257
- printk(KERN_DEBUG "kvm: guest trying to enter paravirtual mode\n");
6258
- printk(KERN_DEBUG ".... para_state_gpa: %08Lx\n", para_state_gpa);
6261
- * Needs to be page aligned:
6263
- if (para_state_gpa != PAGE_ALIGN(para_state_gpa))
6266
- para_state_hpa = gpa_to_hpa(vcpu, para_state_gpa);
6267
- printk(KERN_DEBUG ".... para_state_hpa: %08Lx\n", para_state_hpa);
6268
- if (is_error_hpa(para_state_hpa))
6271
- mark_page_dirty(vcpu->kvm, para_state_gpa >> PAGE_SHIFT);
6272
- para_state_page = pfn_to_page(para_state_hpa >> PAGE_SHIFT);
6273
- para_state = kmap(para_state_page);
6275
- printk(KERN_DEBUG ".... guest version: %d\n", para_state->guest_version);
6276
- printk(KERN_DEBUG ".... size: %d\n", para_state->size);
6278
- para_state->host_version = KVM_PARA_API_VERSION;
6280
- * We cannot support guests that try to register themselves
6281
- * with a newer API version than the host supports:
6283
- if (para_state->guest_version > KVM_PARA_API_VERSION) {
6284
- para_state->ret = -KVM_EINVAL;
6285
- goto err_kunmap_skip;
6288
- hypercall_gpa = para_state->hypercall_gpa;
6289
- hypercall_hpa = gpa_to_hpa(vcpu, hypercall_gpa);
6290
- printk(KERN_DEBUG ".... hypercall_hpa: %08Lx\n", hypercall_hpa);
6291
- if (is_error_hpa(hypercall_hpa)) {
6292
- para_state->ret = -KVM_EINVAL;
6293
- goto err_kunmap_skip;
6296
- printk(KERN_DEBUG "kvm: para guest successfully registered.\n");
6297
- vcpu->para_state_page = para_state_page;
6298
- vcpu->para_state_gpa = para_state_gpa;
6299
- vcpu->hypercall_gpa = hypercall_gpa;
6301
- mark_page_dirty(vcpu->kvm, hypercall_gpa >> PAGE_SHIFT);
6302
- hypercall = kmap_atomic(pfn_to_page(hypercall_hpa >> PAGE_SHIFT),
6303
- KM_USER1) + (hypercall_hpa & ~PAGE_MASK);
6304
- kvm_x86_ops->patch_hypercall(vcpu, hypercall);
6305
- kunmap_atomic(hypercall, KM_USER1);
6307
- para_state->ret = 0;
6309
- kunmap(para_state_page);
6315
-int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
6320
- case 0xc0010010: /* SYSCFG */
6321
- case 0xc0010015: /* HWCR */
6322
- case MSR_IA32_PLATFORM_ID:
6323
- case MSR_IA32_P5_MC_ADDR:
6324
- case MSR_IA32_P5_MC_TYPE:
6325
- case MSR_IA32_MC0_CTL:
6326
- case MSR_IA32_MCG_STATUS:
6327
- case MSR_IA32_MCG_CAP:
6328
- case MSR_IA32_MC0_MISC:
6329
- case MSR_IA32_MC0_MISC+4:
6330
- case MSR_IA32_MC0_MISC+8:
6331
- case MSR_IA32_MC0_MISC+12:
6332
- case MSR_IA32_MC0_MISC+16:
6333
- case MSR_IA32_UCODE_REV:
6334
- case MSR_IA32_PERF_STATUS:
6335
- case MSR_IA32_EBL_CR_POWERON:
6336
- /* MTRR registers */
6338
- case 0x200 ... 0x2ff:
6341
- case 0xcd: /* fsb frequency */
6344
- case MSR_IA32_APICBASE:
6345
- data = kvm_get_apic_base(vcpu);
6347
- case MSR_IA32_MISC_ENABLE:
6348
- data = vcpu->ia32_misc_enable_msr;
6350
-#ifdef CONFIG_X86_64
6352
- data = vcpu->shadow_efer;
6356
- pr_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", msr);
6362
-EXPORT_SYMBOL_GPL(kvm_get_msr_common);
6365
- * Reads an msr value (of 'msr_index') into 'pdata'.
6366
- * Returns 0 on success, non-0 otherwise.
6367
- * Assumes vcpu_load() was already called.
6369
-int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
6371
- return kvm_x86_ops->get_msr(vcpu, msr_index, pdata);
6374
-#ifdef CONFIG_X86_64
6376
-static void set_efer(struct kvm_vcpu *vcpu, u64 efer)
6378
- if (efer & EFER_RESERVED_BITS) {
6379
- printk(KERN_DEBUG "set_efer: 0x%llx #GP, reserved bits\n",
6385
- if (is_paging(vcpu)
6386
- && (vcpu->shadow_efer & EFER_LME) != (efer & EFER_LME)) {
6387
- printk(KERN_DEBUG "set_efer: #GP, change LME while paging\n");
6392
- kvm_x86_ops->set_efer(vcpu, efer);
6394
- efer &= ~EFER_LMA;
6395
- efer |= vcpu->shadow_efer & EFER_LMA;
6397
- vcpu->shadow_efer = efer;
6402
-int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
6405
-#ifdef CONFIG_X86_64
6407
- set_efer(vcpu, data);
6410
- case MSR_IA32_MC0_STATUS:
6411
- pr_unimpl(vcpu, "%s: MSR_IA32_MC0_STATUS 0x%llx, nop\n",
6412
- __FUNCTION__, data);
6414
- case MSR_IA32_MCG_STATUS:
6415
- pr_unimpl(vcpu, "%s: MSR_IA32_MCG_STATUS 0x%llx, nop\n",
6416
- __FUNCTION__, data);
6418
- case MSR_IA32_UCODE_REV:
6419
- case MSR_IA32_UCODE_WRITE:
6420
- case 0x200 ... 0x2ff: /* MTRRs */
6422
- case MSR_IA32_APICBASE:
6423
- kvm_set_apic_base(vcpu, data);
6425
- case MSR_IA32_MISC_ENABLE:
6426
- vcpu->ia32_misc_enable_msr = data;
6429
- * This is the 'probe whether the host is KVM' logic:
6431
- case MSR_KVM_API_MAGIC:
6432
- return vcpu_register_para(vcpu, data);
6435
- pr_unimpl(vcpu, "unhandled wrmsr: 0x%x\n", msr);
6440
-EXPORT_SYMBOL_GPL(kvm_set_msr_common);
6443
- * Writes msr value into into the appropriate "register".
6444
- * Returns 0 on success, non-0 otherwise.
6445
- * Assumes vcpu_load() was already called.
6447
-int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
6449
- return kvm_x86_ops->set_msr(vcpu, msr_index, data);
6452
void kvm_resched(struct kvm_vcpu *vcpu)
6454
if (!need_resched())
6455
@@ -1688,851 +665,6 @@ void kvm_resched(struct kvm_vcpu *vcpu)
6457
EXPORT_SYMBOL_GPL(kvm_resched);
6459
-void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
6463
- struct kvm_cpuid_entry *e, *best;
6465
- kvm_x86_ops->cache_regs(vcpu);
6466
- function = vcpu->regs[VCPU_REGS_RAX];
6467
- vcpu->regs[VCPU_REGS_RAX] = 0;
6468
- vcpu->regs[VCPU_REGS_RBX] = 0;
6469
- vcpu->regs[VCPU_REGS_RCX] = 0;
6470
- vcpu->regs[VCPU_REGS_RDX] = 0;
6472
- for (i = 0; i < vcpu->cpuid_nent; ++i) {
6473
- e = &vcpu->cpuid_entries[i];
6474
- if (e->function == function) {
6479
- * Both basic or both extended?
6481
- if (((e->function ^ function) & 0x80000000) == 0)
6482
- if (!best || e->function > best->function)
6486
- vcpu->regs[VCPU_REGS_RAX] = best->eax;
6487
- vcpu->regs[VCPU_REGS_RBX] = best->ebx;
6488
- vcpu->regs[VCPU_REGS_RCX] = best->ecx;
6489
- vcpu->regs[VCPU_REGS_RDX] = best->edx;
6491
- kvm_x86_ops->decache_regs(vcpu);
6492
- kvm_x86_ops->skip_emulated_instruction(vcpu);
6494
-EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
6496
-static int pio_copy_data(struct kvm_vcpu *vcpu)
6498
- void *p = vcpu->pio_data;
6501
- int nr_pages = vcpu->pio.guest_pages[1] ? 2 : 1;
6503
- q = vmap(vcpu->pio.guest_pages, nr_pages, VM_READ|VM_WRITE,
6506
- free_pio_guest_pages(vcpu);
6509
- q += vcpu->pio.guest_page_offset;
6510
- bytes = vcpu->pio.size * vcpu->pio.cur_count;
6512
- memcpy(q, p, bytes);
6514
- memcpy(p, q, bytes);
6515
- q -= vcpu->pio.guest_page_offset;
6517
- free_pio_guest_pages(vcpu);
6521
-static int complete_pio(struct kvm_vcpu *vcpu)
6523
- struct kvm_pio_request *io = &vcpu->pio;
6527
- kvm_x86_ops->cache_regs(vcpu);
6529
- if (!io->string) {
6531
- memcpy(&vcpu->regs[VCPU_REGS_RAX], vcpu->pio_data,
6535
- r = pio_copy_data(vcpu);
6537
- kvm_x86_ops->cache_regs(vcpu);
6544
- delta *= io->cur_count;
6546
- * The size of the register should really depend on
6547
- * current address size.
6549
- vcpu->regs[VCPU_REGS_RCX] -= delta;
6553
- delta *= io->size;
6555
- vcpu->regs[VCPU_REGS_RDI] += delta;
6557
- vcpu->regs[VCPU_REGS_RSI] += delta;
6560
- kvm_x86_ops->decache_regs(vcpu);
6562
- io->count -= io->cur_count;
6563
- io->cur_count = 0;
6568
-static void kernel_pio(struct kvm_io_device *pio_dev,
6569
- struct kvm_vcpu *vcpu,
6572
- /* TODO: String I/O for in kernel device */
6574
- mutex_lock(&vcpu->kvm->lock);
6576
- kvm_iodevice_read(pio_dev, vcpu->pio.port,
6580
- kvm_iodevice_write(pio_dev, vcpu->pio.port,
6583
- mutex_unlock(&vcpu->kvm->lock);
6586
-static void pio_string_write(struct kvm_io_device *pio_dev,
6587
- struct kvm_vcpu *vcpu)
6589
- struct kvm_pio_request *io = &vcpu->pio;
6590
- void *pd = vcpu->pio_data;
6593
- mutex_lock(&vcpu->kvm->lock);
6594
- for (i = 0; i < io->cur_count; i++) {
6595
- kvm_iodevice_write(pio_dev, io->port,
6600
- mutex_unlock(&vcpu->kvm->lock);
6603
-int kvm_emulate_pio (struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
6604
- int size, unsigned port)
6606
- struct kvm_io_device *pio_dev;
6608
- vcpu->run->exit_reason = KVM_EXIT_IO;
6609
- vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
6610
- vcpu->run->io.size = vcpu->pio.size = size;
6611
- vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
6612
- vcpu->run->io.count = vcpu->pio.count = vcpu->pio.cur_count = 1;
6613
- vcpu->run->io.port = vcpu->pio.port = port;
6614
- vcpu->pio.in = in;
6615
- vcpu->pio.string = 0;
6616
- vcpu->pio.down = 0;
6617
- vcpu->pio.guest_page_offset = 0;
6618
- vcpu->pio.rep = 0;
6620
- kvm_x86_ops->cache_regs(vcpu);
6621
- memcpy(vcpu->pio_data, &vcpu->regs[VCPU_REGS_RAX], 4);
6622
- kvm_x86_ops->decache_regs(vcpu);
6624
- kvm_x86_ops->skip_emulated_instruction(vcpu);
6626
- pio_dev = vcpu_find_pio_dev(vcpu, port);
6628
- kernel_pio(pio_dev, vcpu, vcpu->pio_data);
6629
- complete_pio(vcpu);
6634
-EXPORT_SYMBOL_GPL(kvm_emulate_pio);
6636
-int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
6637
- int size, unsigned long count, int down,
6638
- gva_t address, int rep, unsigned port)
6640
- unsigned now, in_page;
6643
- struct page *page;
6644
- struct kvm_io_device *pio_dev;
6646
- vcpu->run->exit_reason = KVM_EXIT_IO;
6647
- vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
6648
- vcpu->run->io.size = vcpu->pio.size = size;
6649
- vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
6650
- vcpu->run->io.count = vcpu->pio.count = vcpu->pio.cur_count = count;
6651
- vcpu->run->io.port = vcpu->pio.port = port;
6652
- vcpu->pio.in = in;
6653
- vcpu->pio.string = 1;
6654
- vcpu->pio.down = down;
6655
- vcpu->pio.guest_page_offset = offset_in_page(address);
6656
- vcpu->pio.rep = rep;
6659
- kvm_x86_ops->skip_emulated_instruction(vcpu);
6664
- in_page = PAGE_SIZE - offset_in_page(address);
6666
- in_page = offset_in_page(address) + size;
6667
- now = min(count, (unsigned long)in_page / size);
6670
- * String I/O straddles page boundary. Pin two guest pages
6671
- * so that we satisfy atomicity constraints. Do just one
6672
- * transaction to avoid complexity.
6679
- * String I/O in reverse. Yuck. Kill the guest, fix later.
6681
- pr_unimpl(vcpu, "guest string pio down\n");
6685
- vcpu->run->io.count = now;
6686
- vcpu->pio.cur_count = now;
6688
- if (vcpu->pio.cur_count == vcpu->pio.count)
6689
- kvm_x86_ops->skip_emulated_instruction(vcpu);
6691
- for (i = 0; i < nr_pages; ++i) {
6692
- mutex_lock(&vcpu->kvm->lock);
6693
- page = gva_to_page(vcpu, address + i * PAGE_SIZE);
6696
- vcpu->pio.guest_pages[i] = page;
6697
- mutex_unlock(&vcpu->kvm->lock);
6700
- free_pio_guest_pages(vcpu);
6705
- pio_dev = vcpu_find_pio_dev(vcpu, port);
6706
- if (!vcpu->pio.in) {
6707
- /* string PIO write */
6708
- ret = pio_copy_data(vcpu);
6709
- if (ret >= 0 && pio_dev) {
6710
- pio_string_write(pio_dev, vcpu);
6711
- complete_pio(vcpu);
6712
- if (vcpu->pio.count == 0)
6715
- } else if (pio_dev)
6716
- pr_unimpl(vcpu, "no string pio read support yet, "
6717
- "port %x size %d count %ld\n",
6718
- port, size, count);
6722
-EXPORT_SYMBOL_GPL(kvm_emulate_pio_string);
6725
- * Check if userspace requested an interrupt window, and that the
6726
- * interrupt window is open.
6728
- * No need to exit to userspace if we already have an interrupt queued.
6730
-static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu,
6731
- struct kvm_run *kvm_run)
6733
- return (!vcpu->irq_summary &&
6734
- kvm_run->request_interrupt_window &&
6735
- vcpu->interrupt_window_open &&
6736
- (kvm_x86_ops->get_rflags(vcpu) & X86_EFLAGS_IF));
6739
-static void post_kvm_run_save(struct kvm_vcpu *vcpu,
6740
- struct kvm_run *kvm_run)
6742
- kvm_run->if_flag = (kvm_x86_ops->get_rflags(vcpu) & X86_EFLAGS_IF) != 0;
6743
- kvm_run->cr8 = get_cr8(vcpu);
6744
- kvm_run->apic_base = kvm_get_apic_base(vcpu);
6745
- if (irqchip_in_kernel(vcpu->kvm))
6746
- kvm_run->ready_for_interrupt_injection = 1;
6748
- kvm_run->ready_for_interrupt_injection =
6749
- (vcpu->interrupt_window_open &&
6750
- vcpu->irq_summary == 0);
6753
-static int __vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
6757
- if (unlikely(vcpu->mp_state == VCPU_MP_STATE_SIPI_RECEIVED)) {
6758
- printk("vcpu %d received sipi with vector # %x\n",
6759
- vcpu->vcpu_id, vcpu->sipi_vector);
6760
- kvm_lapic_reset(vcpu);
6761
- kvm_x86_ops->vcpu_reset(vcpu);
6762
- vcpu->mp_state = VCPU_MP_STATE_RUNNABLE;
6766
- if (vcpu->guest_debug.enabled)
6767
- kvm_x86_ops->guest_debug_pre(vcpu);
6770
- r = kvm_mmu_reload(vcpu);
6774
- preempt_disable();
6776
- kvm_x86_ops->prepare_guest_switch(vcpu);
6777
- kvm_load_guest_fpu(vcpu);
6779
- local_irq_disable();
6781
- if (signal_pending(current)) {
6782
- local_irq_enable();
6785
- kvm_run->exit_reason = KVM_EXIT_INTR;
6786
- ++vcpu->stat.signal_exits;
6790
- if (irqchip_in_kernel(vcpu->kvm))
6791
- kvm_x86_ops->inject_pending_irq(vcpu);
6792
- else if (!vcpu->mmio_read_completed)
6793
- kvm_x86_ops->inject_pending_vectors(vcpu, kvm_run);
6795
- vcpu->guest_mode = 1;
6796
- kvm_guest_enter();
6798
- if (vcpu->requests)
6799
- if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests))
6800
- kvm_x86_ops->tlb_flush(vcpu);
6802
- kvm_x86_ops->run(vcpu, kvm_run);
6804
- vcpu->guest_mode = 0;
6805
- local_irq_enable();
6807
- ++vcpu->stat.exits;
6810
- * We must have an instruction between local_irq_enable() and
6811
- * kvm_guest_exit(), so the timer interrupt isn't delayed by
6812
- * the interrupt shadow. The stat.exits increment will do nicely.
- * But we need to prevent reordering, hence this barrier():
- * Profile KVM exit RIPs:
- if (unlikely(prof_on == KVM_PROFILING)) {
- kvm_x86_ops->cache_regs(vcpu);
- profile_hit(KVM_PROFILING, (void *)vcpu->rip);
- r = kvm_x86_ops->handle_exit(kvm_run, vcpu);
- if (dm_request_for_irq_injection(vcpu, kvm_run)) {
- kvm_run->exit_reason = KVM_EXIT_INTR;
- ++vcpu->stat.request_irq_exits;
- if (!need_resched()) {
- ++vcpu->stat.light_exits;
- kvm_resched(vcpu);
- post_kvm_run_save(vcpu, kvm_run);
-static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
- sigset_t sigsaved;
- if (unlikely(vcpu->mp_state == VCPU_MP_STATE_UNINITIALIZED)) {
- kvm_vcpu_block(vcpu);
- if (vcpu->sigset_active)
- sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
- /* re-sync apic's tpr */
- if (!irqchip_in_kernel(vcpu->kvm))
- set_cr8(vcpu, kvm_run->cr8);
- if (vcpu->pio.cur_count) {
- r = complete_pio(vcpu);
- if (vcpu->mmio_needed) {
- memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
- vcpu->mmio_read_completed = 1;
- vcpu->mmio_needed = 0;
- r = emulate_instruction(vcpu, kvm_run,
- vcpu->mmio_fault_cr2, 0);
- if (r == EMULATE_DO_MMIO) {
- * Read-modify-write. Back to userspace.
- if (kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
- kvm_x86_ops->cache_regs(vcpu);
- vcpu->regs[VCPU_REGS_RAX] = kvm_run->hypercall.ret;
- kvm_x86_ops->decache_regs(vcpu);
- r = __vcpu_run(vcpu, kvm_run);
- if (vcpu->sigset_active)
- sigprocmask(SIG_SETMASK, &sigsaved, NULL);
-static int kvm_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu,
- struct kvm_regs *regs)
- kvm_x86_ops->cache_regs(vcpu);
- regs->rax = vcpu->regs[VCPU_REGS_RAX];
- regs->rbx = vcpu->regs[VCPU_REGS_RBX];
- regs->rcx = vcpu->regs[VCPU_REGS_RCX];
- regs->rdx = vcpu->regs[VCPU_REGS_RDX];
- regs->rsi = vcpu->regs[VCPU_REGS_RSI];
- regs->rdi = vcpu->regs[VCPU_REGS_RDI];
- regs->rsp = vcpu->regs[VCPU_REGS_RSP];
- regs->rbp = vcpu->regs[VCPU_REGS_RBP];
-#ifdef CONFIG_X86_64
- regs->r8 = vcpu->regs[VCPU_REGS_R8];
- regs->r9 = vcpu->regs[VCPU_REGS_R9];
- regs->r10 = vcpu->regs[VCPU_REGS_R10];
- regs->r11 = vcpu->regs[VCPU_REGS_R11];
- regs->r12 = vcpu->regs[VCPU_REGS_R12];
- regs->r13 = vcpu->regs[VCPU_REGS_R13];
- regs->r14 = vcpu->regs[VCPU_REGS_R14];
- regs->r15 = vcpu->regs[VCPU_REGS_R15];
- regs->rip = vcpu->rip;
- regs->rflags = kvm_x86_ops->get_rflags(vcpu);
- * Don't leak debug flags in case they were set for guest debugging
- if (vcpu->guest_debug.enabled && vcpu->guest_debug.singlestep)
- regs->rflags &= ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
-static int kvm_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu,
- struct kvm_regs *regs)
- vcpu->regs[VCPU_REGS_RAX] = regs->rax;
- vcpu->regs[VCPU_REGS_RBX] = regs->rbx;
- vcpu->regs[VCPU_REGS_RCX] = regs->rcx;
- vcpu->regs[VCPU_REGS_RDX] = regs->rdx;
- vcpu->regs[VCPU_REGS_RSI] = regs->rsi;
- vcpu->regs[VCPU_REGS_RDI] = regs->rdi;
- vcpu->regs[VCPU_REGS_RSP] = regs->rsp;
- vcpu->regs[VCPU_REGS_RBP] = regs->rbp;
-#ifdef CONFIG_X86_64
- vcpu->regs[VCPU_REGS_R8] = regs->r8;
- vcpu->regs[VCPU_REGS_R9] = regs->r9;
- vcpu->regs[VCPU_REGS_R10] = regs->r10;
- vcpu->regs[VCPU_REGS_R11] = regs->r11;
- vcpu->regs[VCPU_REGS_R12] = regs->r12;
- vcpu->regs[VCPU_REGS_R13] = regs->r13;
- vcpu->regs[VCPU_REGS_R14] = regs->r14;
- vcpu->regs[VCPU_REGS_R15] = regs->r15;
- vcpu->rip = regs->rip;
- kvm_x86_ops->set_rflags(vcpu, regs->rflags);
- kvm_x86_ops->decache_regs(vcpu);
-static void get_segment(struct kvm_vcpu *vcpu,
- struct kvm_segment *var, int seg)
- return kvm_x86_ops->get_segment(vcpu, var, seg);
-static int kvm_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
- struct kvm_sregs *sregs)
- struct descriptor_table dt;
- get_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
- get_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
- get_segment(vcpu, &sregs->es, VCPU_SREG_ES);
- get_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
- get_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
- get_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
- get_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
- get_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
- kvm_x86_ops->get_idt(vcpu, &dt);
- sregs->idt.limit = dt.limit;
- sregs->idt.base = dt.base;
- kvm_x86_ops->get_gdt(vcpu, &dt);
- sregs->gdt.limit = dt.limit;
- sregs->gdt.base = dt.base;
- kvm_x86_ops->decache_cr4_guest_bits(vcpu);
- sregs->cr0 = vcpu->cr0;
- sregs->cr2 = vcpu->cr2;
- sregs->cr3 = vcpu->cr3;
- sregs->cr4 = vcpu->cr4;
- sregs->cr8 = get_cr8(vcpu);
- sregs->efer = vcpu->shadow_efer;
- sregs->apic_base = kvm_get_apic_base(vcpu);
- if (irqchip_in_kernel(vcpu->kvm)) {
- memset(sregs->interrupt_bitmap, 0,
- sizeof sregs->interrupt_bitmap);
- pending_vec = kvm_x86_ops->get_irq(vcpu);
- if (pending_vec >= 0)
- set_bit(pending_vec, (unsigned long *)sregs->interrupt_bitmap);
- memcpy(sregs->interrupt_bitmap, vcpu->irq_pending,
- sizeof sregs->interrupt_bitmap);
-static void set_segment(struct kvm_vcpu *vcpu,
- struct kvm_segment *var, int seg)
- return kvm_x86_ops->set_segment(vcpu, var, seg);
-static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
- struct kvm_sregs *sregs)
- int mmu_reset_needed = 0;
- int i, pending_vec, max_bits;
- struct descriptor_table dt;
- dt.limit = sregs->idt.limit;
- dt.base = sregs->idt.base;
- kvm_x86_ops->set_idt(vcpu, &dt);
- dt.limit = sregs->gdt.limit;
- dt.base = sregs->gdt.base;
- kvm_x86_ops->set_gdt(vcpu, &dt);
- vcpu->cr2 = sregs->cr2;
- mmu_reset_needed |= vcpu->cr3 != sregs->cr3;
- vcpu->cr3 = sregs->cr3;
- set_cr8(vcpu, sregs->cr8);
- mmu_reset_needed |= vcpu->shadow_efer != sregs->efer;
-#ifdef CONFIG_X86_64
- kvm_x86_ops->set_efer(vcpu, sregs->efer);
- kvm_set_apic_base(vcpu, sregs->apic_base);
- kvm_x86_ops->decache_cr4_guest_bits(vcpu);
- mmu_reset_needed |= vcpu->cr0 != sregs->cr0;
- vcpu->cr0 = sregs->cr0;
- kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
- mmu_reset_needed |= vcpu->cr4 != sregs->cr4;
- kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
- if (!is_long_mode(vcpu) && is_pae(vcpu))
- load_pdptrs(vcpu, vcpu->cr3);
- if (mmu_reset_needed)
- kvm_mmu_reset_context(vcpu);
- if (!irqchip_in_kernel(vcpu->kvm)) {
- memcpy(vcpu->irq_pending, sregs->interrupt_bitmap,
- sizeof vcpu->irq_pending);
- vcpu->irq_summary = 0;
- for (i = 0; i < ARRAY_SIZE(vcpu->irq_pending); ++i)
- if (vcpu->irq_pending[i])
- __set_bit(i, &vcpu->irq_summary);
- max_bits = (sizeof sregs->interrupt_bitmap) << 3;
- pending_vec = find_first_bit(
- (const unsigned long *)sregs->interrupt_bitmap,
- /* Only pending external irq is handled here */
- if (pending_vec < max_bits) {
- kvm_x86_ops->set_irq(vcpu, pending_vec);
- printk("Set back pending irq %d\n", pending_vec);
- set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
- set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
- set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
- set_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
- set_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
- set_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
- set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
- set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
-void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
- struct kvm_segment cs;
- get_segment(vcpu, &cs, VCPU_SREG_CS);
-EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
- * List of msr numbers which we expose to userspace through KVM_GET_MSRS
- * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
- * This list is modified at module load time to reflect the
- * capabilities of the host cpu.
-static u32 msrs_to_save[] = {
- MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
-#ifdef CONFIG_X86_64
- MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
- MSR_IA32_TIME_STAMP_COUNTER,
-static unsigned num_msrs_to_save;
-static u32 emulated_msrs[] = {
- MSR_IA32_MISC_ENABLE,
-static __init void kvm_init_msr_list(void)
- for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
- if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
- msrs_to_save[j] = msrs_to_save[i];
- num_msrs_to_save = j;
- * Adapt set_msr() to msr_io()'s calling convention
-static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
- return kvm_set_msr(vcpu, index, *data);
- * Read or write a bunch of msrs. All parameters are kernel addresses.
- * @return number of msrs set successfully.
-static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
- struct kvm_msr_entry *entries,
- int (*do_msr)(struct kvm_vcpu *vcpu,
- unsigned index, u64 *data))
- for (i = 0; i < msrs->nmsrs; ++i)
- if (do_msr(vcpu, entries[i].index, &entries[i].data))
- * Read or write a bunch of msrs. Parameters are user addresses.
- * @return number of msrs set successfully.
-static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
- int (*do_msr)(struct kvm_vcpu *vcpu,
- unsigned index, u64 *data),
- struct kvm_msrs msrs;
- struct kvm_msr_entry *entries;
- if (copy_from_user(&msrs, user_msrs, sizeof msrs))
- if (msrs.nmsrs >= MAX_IO_MSRS)
- size = sizeof(struct kvm_msr_entry) * msrs.nmsrs;
- entries = vmalloc(size);
- if (copy_from_user(entries, user_msrs->entries, size))
- r = n = __msr_io(vcpu, &msrs, entries, do_msr);
- if (writeback && copy_to_user(user_msrs->entries, entries, size))
- * Translate a guest virtual address to a guest physical address.
-static int kvm_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
- struct kvm_translation *tr)
- unsigned long vaddr = tr->linear_address;
- mutex_lock(&vcpu->kvm->lock);
- gpa = vcpu->mmu.gva_to_gpa(vcpu, vaddr);
- tr->physical_address = gpa;
- tr->valid = gpa != UNMAPPED_GVA;
- tr->writeable = 1;
- mutex_unlock(&vcpu->kvm->lock);
-static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
- struct kvm_interrupt *irq)
- if (irq->irq < 0 || irq->irq >= 256)
- if (irqchip_in_kernel(vcpu->kvm))
- set_bit(irq->irq, vcpu->irq_pending);
- set_bit(irq->irq / BITS_PER_LONG, &vcpu->irq_summary);
-static int kvm_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
- struct kvm_debug_guest *dbg)
- r = kvm_x86_ops->set_guest_debug(vcpu, dbg);
 static struct page *kvm_vcpu_nopage(struct vm_area_struct *vma,
 unsigned long address,
@@ -2608,26 +740,21 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
- vcpu = kvm_x86_ops->vcpu_create(kvm, n);
+ vcpu = kvm_arch_vcpu_create(kvm, n);
 return PTR_ERR(vcpu);
 preempt_notifier_init(&vcpu->preempt_notifier, &kvm_preempt_ops);
- /* We do fxsave: this must be aligned. */
- BUG_ON((unsigned long)&vcpu->host_fx_image & 0xF);
- r = kvm_mmu_setup(vcpu);
+ r = kvm_arch_vcpu_setup(vcpu);
+ goto vcpu_destroy;
 mutex_lock(&kvm->lock);
 if (kvm->vcpus[n]) {
 mutex_unlock(&kvm->lock);
+ goto vcpu_destroy;
 kvm->vcpus[n] = vcpu;
 mutex_unlock(&kvm->lock);
@@ -2642,56 +769,8 @@ unlink:
 mutex_lock(&kvm->lock);
 kvm->vcpus[n] = NULL;
 mutex_unlock(&kvm->lock);
- kvm_mmu_unload(vcpu);
- kvm_x86_ops->vcpu_free(vcpu);
-static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
- struct kvm_cpuid_entry *e, *entry;
- rdmsrl(MSR_EFER, efer);
- for (i = 0; i < vcpu->cpuid_nent; ++i) {
- e = &vcpu->cpuid_entries[i];
- if (e->function == 0x80000001) {
- if (entry && (entry->edx & (1 << 20)) && !(efer & EFER_NX)) {
- entry->edx &= ~(1 << 20);
- printk(KERN_INFO "kvm: guest NX capability removed\n");
-static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
- struct kvm_cpuid *cpuid,
- struct kvm_cpuid_entry __user *entries)
- if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
- if (copy_from_user(&vcpu->cpuid_entries, entries,
- cpuid->nent * sizeof(struct kvm_cpuid_entry)))
- vcpu->cpuid_nent = cpuid->nent;
- cpuid_fix_nx_cap(vcpu);
+ kvm_arch_vcpu_destroy(vcpu);
@@ -2706,107 +785,27 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
- * fxsave fpu state. Taken from x86_64/processor.h. To be killed when
- * we have asm/x86/processor.h
- u32 st_space[32]; /* 8*16 bytes for each FP-reg = 128 bytes */
-#ifdef CONFIG_X86_64
- u32 xmm_space[64]; /* 16*16 bytes for each XMM-reg = 256 bytes */
- u32 xmm_space[32]; /* 8*16 bytes for each XMM-reg = 128 bytes */
-static int kvm_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
- struct fxsave *fxsave = (struct fxsave *)&vcpu->guest_fx_image;
- memcpy(fpu->fpr, fxsave->st_space, 128);
- fpu->fcw = fxsave->cwd;
- fpu->fsw = fxsave->swd;
- fpu->ftwx = fxsave->twd;
- fpu->last_opcode = fxsave->fop;
- fpu->last_ip = fxsave->rip;
- fpu->last_dp = fxsave->rdp;
- memcpy(fpu->xmm, fxsave->xmm_space, sizeof fxsave->xmm_space);
-static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
- struct fxsave *fxsave = (struct fxsave *)&vcpu->guest_fx_image;
- memcpy(fxsave->st_space, fpu->fpr, 128);
- fxsave->cwd = fpu->fcw;
- fxsave->swd = fpu->fsw;
- fxsave->twd = fpu->ftwx;
- fxsave->fop = fpu->last_opcode;
- fxsave->rip = fpu->last_ip;
- fxsave->rdp = fpu->last_dp;
- memcpy(fxsave->xmm_space, fpu->xmm, sizeof fxsave->xmm_space);
-static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
- struct kvm_lapic_state *s)
- memcpy(s->regs, vcpu->apic->regs, sizeof *s);
-static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu,
- struct kvm_lapic_state *s)
- memcpy(vcpu->apic->regs, s->regs, sizeof *s);
- kvm_apic_post_state_restore(vcpu);
 static long kvm_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg)
 struct kvm_vcpu *vcpu = filp->private_data;
 void __user *argp = (void __user *)arg;
+ if (vcpu->kvm->mm != current->mm)
- r = kvm_vcpu_ioctl_run(vcpu, vcpu->run);
+ r = kvm_arch_vcpu_ioctl_run(vcpu, vcpu->run);
 case KVM_GET_REGS: {
 struct kvm_regs kvm_regs;
 memset(&kvm_regs, 0, sizeof kvm_regs);
- r = kvm_vcpu_ioctl_get_regs(vcpu, &kvm_regs);
+ r = kvm_arch_vcpu_ioctl_get_regs(vcpu, &kvm_regs);
@@ -2821,7 +820,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 if (copy_from_user(&kvm_regs, argp, sizeof kvm_regs))
- r = kvm_vcpu_ioctl_set_regs(vcpu, &kvm_regs);
+ r = kvm_arch_vcpu_ioctl_set_regs(vcpu, &kvm_regs);
@@ -2831,7 +830,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 struct kvm_sregs kvm_sregs;
 memset(&kvm_sregs, 0, sizeof kvm_sregs);
- r = kvm_vcpu_ioctl_get_sregs(vcpu, &kvm_sregs);
+ r = kvm_arch_vcpu_ioctl_get_sregs(vcpu, &kvm_sregs);
@@ -2846,7 +845,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 if (copy_from_user(&kvm_sregs, argp, sizeof kvm_sregs))
- r = kvm_vcpu_ioctl_set_sregs(vcpu, &kvm_sregs);
+ r = kvm_arch_vcpu_ioctl_set_sregs(vcpu, &kvm_sregs);
@@ -2858,7 +857,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 if (copy_from_user(&tr, argp, sizeof tr))
- r = kvm_vcpu_ioctl_translate(vcpu, &tr);
+ r = kvm_arch_vcpu_ioctl_translate(vcpu, &tr);
@@ -2867,48 +866,18 @@ static long kvm_vcpu_ioctl(struct file *filp,
- case KVM_INTERRUPT: {
- struct kvm_interrupt irq;
- if (copy_from_user(&irq, argp, sizeof irq))
- r = kvm_vcpu_ioctl_interrupt(vcpu, &irq);
 case KVM_DEBUG_GUEST: {
 struct kvm_debug_guest dbg;
 if (copy_from_user(&dbg, argp, sizeof dbg))
- r = kvm_vcpu_ioctl_debug_guest(vcpu, &dbg);
+ r = kvm_arch_vcpu_ioctl_debug_guest(vcpu, &dbg);
- case KVM_GET_MSRS:
- r = msr_io(vcpu, argp, kvm_get_msr, 1);
- case KVM_SET_MSRS:
- r = msr_io(vcpu, argp, do_set_msr, 0);
- case KVM_SET_CPUID: {
- struct kvm_cpuid __user *cpuid_arg = argp;
- struct kvm_cpuid cpuid;
- if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
- r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
 case KVM_SET_SIGNAL_MASK: {
 struct kvm_signal_mask __user *sigmask_arg = argp;
 struct kvm_signal_mask kvm_sigmask;
@@ -2936,7 +905,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 memset(&fpu, 0, sizeof fpu);
- r = kvm_vcpu_ioctl_get_fpu(vcpu, &fpu);
+ r = kvm_arch_vcpu_ioctl_get_fpu(vcpu, &fpu);
@@ -2951,39 +920,14 @@ static long kvm_vcpu_ioctl(struct file *filp,
 if (copy_from_user(&fpu, argp, sizeof fpu))
- r = kvm_vcpu_ioctl_set_fpu(vcpu, &fpu);
- case KVM_GET_LAPIC: {
- struct kvm_lapic_state lapic;
- memset(&lapic, 0, sizeof lapic);
- r = kvm_vcpu_ioctl_get_lapic(vcpu, &lapic);
- if (copy_to_user(argp, &lapic, sizeof lapic))
- case KVM_SET_LAPIC: {
- struct kvm_lapic_state lapic;
- if (copy_from_user(&lapic, argp, sizeof lapic))
- r = kvm_vcpu_ioctl_set_lapic(vcpu, &lapic);;
+ r = kvm_arch_vcpu_ioctl_set_fpu(vcpu, &fpu);
+ r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
@@ -2994,21 +938,25 @@ static long kvm_vm_ioctl(struct file *filp,
 struct kvm *kvm = filp->private_data;
 void __user *argp = (void __user *)arg;
+ if (kvm->mm != current->mm)
 case KVM_CREATE_VCPU:
 r = kvm_vm_ioctl_create_vcpu(kvm, arg);
- case KVM_SET_MEMORY_REGION: {
- struct kvm_memory_region kvm_mem;
+ case KVM_SET_USER_MEMORY_REGION: {
+ struct kvm_userspace_memory_region kvm_userspace_mem;
- if (copy_from_user(&kvm_mem, argp, sizeof kvm_mem))
+ if (copy_from_user(&kvm_userspace_mem, argp,
+ sizeof kvm_userspace_mem))
- r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_mem);
+ r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem, 1);
@@ -3024,88 +972,8 @@ static long kvm_vm_ioctl(struct file *filp,
- case KVM_SET_MEMORY_ALIAS: {
- struct kvm_memory_alias alias;
- if (copy_from_user(&alias, argp, sizeof alias))
- r = kvm_vm_ioctl_set_memory_alias(kvm, &alias);
- case KVM_CREATE_IRQCHIP:
- kvm->vpic = kvm_create_pic(kvm);
- r = kvm_ioapic_init(kvm);
- case KVM_IRQ_LINE: {
- struct kvm_irq_level irq_event;
- if (copy_from_user(&irq_event, argp, sizeof irq_event))
- if (irqchip_in_kernel(kvm)) {
- mutex_lock(&kvm->lock);
- if (irq_event.irq < 16)
- kvm_pic_set_irq(pic_irqchip(kvm),
- kvm_ioapic_set_irq(kvm->vioapic,
- mutex_unlock(&kvm->lock);
- case KVM_GET_IRQCHIP: {
- /* 0: PIC master, 1: PIC slave, 2: IOAPIC */
- struct kvm_irqchip chip;
- if (copy_from_user(&chip, argp, sizeof chip))
- if (!irqchip_in_kernel(kvm))
- r = kvm_vm_ioctl_get_irqchip(kvm, &chip);
- if (copy_to_user(argp, &chip, sizeof chip))
- case KVM_SET_IRQCHIP: {
- /* 0: PIC master, 1: PIC slave, 2: IOAPIC */
- struct kvm_irqchip chip;
- if (copy_from_user(&chip, argp, sizeof chip))
- if (!irqchip_in_kernel(kvm))
- r = kvm_vm_ioctl_set_irqchip(kvm, &chip);
+ r = kvm_arch_vm_ioctl(filp, ioctl, arg);
@@ -3120,10 +988,14 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
- page = gfn_to_page(kvm, pgoff);
+ if (!kvm_is_visible_gfn(kvm, pgoff))
 return NOPAGE_SIGBUS;
+ /* current->mm->mmap_sem is already held so call lockless version */
+ page = __gfn_to_page(kvm, pgoff);
+ if (is_error_page(page)) {
+ kvm_release_page_clean(page);
+ return NOPAGE_SIGBUS;
 *type = VM_FAULT_MINOR;
@@ -3187,47 +1059,9 @@ static long kvm_dev_ioctl(struct file *filp,
 r = kvm_dev_ioctl_create_vm();
- case KVM_GET_MSR_INDEX_LIST: {
- struct kvm_msr_list __user *user_msr_list = argp;
- struct kvm_msr_list msr_list;
- if (copy_from_user(&msr_list, user_msr_list, sizeof msr_list))
- n = msr_list.nmsrs;
- msr_list.nmsrs = num_msrs_to_save + ARRAY_SIZE(emulated_msrs);
- if (copy_to_user(user_msr_list, &msr_list, sizeof msr_list))
- if (n < num_msrs_to_save)
- if (copy_to_user(user_msr_list->indices, &msrs_to_save,
- num_msrs_to_save * sizeof(u32)))
- if (copy_to_user(user_msr_list->indices
- + num_msrs_to_save * sizeof(u32),
- ARRAY_SIZE(emulated_msrs) * sizeof(u32)))
+ case KVM_CHECK_EXTENSION:
+ r = kvm_dev_ioctl_check_extension((long)argp);
- case KVM_CHECK_EXTENSION: {
- int ext = (long)argp;
- case KVM_CAP_IRQCHIP:
 case KVM_GET_VCPU_MMAP_SIZE:
@@ -3235,7 +1069,7 @@ static long kvm_dev_ioctl(struct file *filp,
+ return kvm_arch_dev_ioctl(filp, ioctl, arg);
@@ -3252,41 +1086,6 @@ static struct miscdevice kvm_dev = {
- * Make sure that a cpu that is being hot-unplugged does not have any vcpus
-static void decache_vcpus_on_cpu(int cpu)
- struct kvm_vcpu *vcpu;
- spin_lock(&kvm_lock);
- list_for_each_entry(vm, &vm_list, vm_list)
- for (i = 0; i < KVM_MAX_VCPUS; ++i) {
- vcpu = vm->vcpus[i];
- * If the vcpu is locked, then it is running on some
- * other cpu and therefore it is not cached on the
- * cpu in question.
- * If it's not locked, check the last cpu it executed
- if (mutex_trylock(&vcpu->mutex)) {
- if (vcpu->cpu == cpu) {
- kvm_x86_ops->vcpu_decache(vcpu);
- mutex_unlock(&vcpu->mutex);
- spin_unlock(&kvm_lock);
 static void hardware_enable(void *junk)
 int cpu = raw_smp_processor_id();
@@ -3294,7 +1093,7 @@ static void hardware_enable(void *junk)
 if (cpu_isset(cpu, cpus_hardware_enabled))
 cpu_set(cpu, cpus_hardware_enabled);
- kvm_x86_ops->hardware_enable(NULL);
+ kvm_arch_hardware_enable(NULL);
 static void hardware_disable(void *junk)
@@ -3305,7 +1104,7 @@ static void hardware_disable(void *junk)
 cpu_clear(cpu, cpus_hardware_enabled);
 decache_vcpus_on_cpu(cpu);
- kvm_x86_ops->hardware_disable(NULL);
+ kvm_arch_hardware_disable(NULL);
 static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
@@ -3313,21 +1112,19 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
+ val &= ~CPU_TASKS_FROZEN;
- case CPU_DYING_FROZEN:
 printk(KERN_INFO "kvm: disabling virtualization on CPU%d\n",
 hardware_disable(NULL);
 case CPU_UP_CANCELED:
- case CPU_UP_CANCELED_FROZEN:
 printk(KERN_INFO "kvm: disabling virtualization on CPU%d\n",
 smp_call_function_single(cpu, hardware_disable, NULL, 0, 1);
- case CPU_ONLINE_FROZEN:
 printk(KERN_INFO "kvm: enabling virtualization on CPU%d\n",
 smp_call_function_single(cpu, hardware_enable, NULL, 0, 1);
@@ -3337,7 +1134,7 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
 if (val == SYS_RESTART) {
@@ -3397,7 +1194,22 @@ static struct notifier_block kvm_cpu_notifier = {
 .priority = 20, /* must be > scheduler priority */
-static u64 stat_get(void *_offset)
+static u64 vm_stat_get(void *_offset)
+ unsigned offset = (long)_offset;
+ spin_lock(&kvm_lock);
+ list_for_each_entry(kvm, &vm_list, vm_list)
+ total += *(u32 *)((void *)kvm + offset);
+ spin_unlock(&kvm_lock);
+DEFINE_SIMPLE_ATTRIBUTE(vm_stat_fops, vm_stat_get, NULL, "%llu\n");
+static u64 vcpu_stat_get(void *_offset)
 unsigned offset = (long)_offset;
@@ -3416,9 +1228,14 @@ static u64 stat_get(void *_offset)
-DEFINE_SIMPLE_ATTRIBUTE(stat_fops, stat_get, NULL, "%llu\n");
+DEFINE_SIMPLE_ATTRIBUTE(vcpu_stat_fops, vcpu_stat_get, NULL, "%llu\n");
+static struct file_operations *stat_fops[] = {
+ [KVM_STAT_VCPU] = &vcpu_stat_fops,
+ [KVM_STAT_VM] = &vm_stat_fops,
-static __init void kvm_init_debug(void)
+static void kvm_init_debug(void)
 struct kvm_stats_debugfs_item *p;
@@ -3426,7 +1243,7 @@ static __init void kvm_init_debug(void)
 for (p = debugfs_entries; p->name; ++p)
 p->dentry = debugfs_create_file(p->name, 0444, debugfs_dir,
 (void *)(long)p->offset,
+ stat_fops[p->kind]);
 static void kvm_exit_debug(void)
@@ -3461,7 +1278,7 @@ static struct sys_device kvm_sysdev = {
 .cls = &kvm_sysdev_class,
-hpa_t bad_page_address;
+struct page *bad_page;
 struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
@@ -3473,7 +1290,7 @@ static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
 struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
- kvm_x86_ops->vcpu_load(vcpu, cpu);
+ kvm_arch_vcpu_load(vcpu, cpu);
 static void kvm_sched_out(struct preempt_notifier *pn,
@@ -3481,97 +1298,100 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
- kvm_x86_ops->vcpu_put(vcpu);
+ kvm_arch_vcpu_put(vcpu);
-int kvm_init_x86(struct kvm_x86_ops *ops, unsigned int vcpu_size,
+int kvm_init(void *opaque, unsigned int vcpu_size,
 struct module *module)
- if (kvm_x86_ops) {
- printk(KERN_ERR "kvm: already loaded the other module\n");
- if (!ops->cpu_has_kvm_support()) {
- printk(KERN_ERR "kvm: no hardware support\n");
- return -EOPNOTSUPP;
- if (ops->disabled_by_bios()) {
- printk(KERN_ERR "kvm: disabled by bios\n");
- return -EOPNOTSUPP;
+ r = kvm_arch_init(opaque);
- kvm_x86_ops = ops;
+ bad_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
- r = kvm_x86_ops->hardware_setup();
+ if (bad_page == NULL) {
+ r = kvm_arch_hardware_setup();
 for_each_online_cpu(cpu) {
 smp_call_function_single(cpu,
- kvm_x86_ops->check_processor_compatibility,
+ kvm_arch_check_processor_compat,
 on_each_cpu(hardware_enable, NULL, 0, 1);
 r = register_cpu_notifier(&kvm_cpu_notifier);
 register_reboot_notifier(&kvm_reboot_notifier);
 r = sysdev_class_register(&kvm_sysdev_class);
 r = sysdev_register(&kvm_sysdev);
 /* A kmem cache lets us meet the alignment requirements of fx_save. */
 kvm_vcpu_cache = kmem_cache_create("kvm_vcpu", vcpu_size,
- __alignof__(struct kvm_vcpu), 0, 0);
+ __alignof__(struct kvm_vcpu),
 if (!kvm_vcpu_cache) {
 kvm_chardev_ops.owner = module;
 r = misc_register(&kvm_dev);
- printk (KERN_ERR "kvm: misc device register failed\n");
+ printk(KERN_ERR "kvm: misc device register failed\n");
 kvm_preempt_ops.sched_in = kvm_sched_in;
 kvm_preempt_ops.sched_out = kvm_sched_out;
 kmem_cache_destroy(kvm_vcpu_cache);
 sysdev_unregister(&kvm_sysdev);
 sysdev_class_unregister(&kvm_sysdev_class);
 unregister_reboot_notifier(&kvm_reboot_notifier);
 unregister_cpu_notifier(&kvm_cpu_notifier);
 on_each_cpu(hardware_disable, NULL, 0, 1);
+ kvm_arch_hardware_unsetup();
- kvm_x86_ops->hardware_unsetup();
+ __free_page(bad_page);
- kvm_x86_ops = NULL;
+EXPORT_SYMBOL_GPL(kvm_init);
-void kvm_exit_x86(void)
+void kvm_exit(void)
 misc_deregister(&kvm_dev);
 kmem_cache_destroy(kvm_vcpu_cache);
@@ -3580,49 +1400,9 @@ void kvm_exit_x86(void)
 unregister_reboot_notifier(&kvm_reboot_notifier);
 unregister_cpu_notifier(&kvm_cpu_notifier);
 on_each_cpu(hardware_disable, NULL, 0, 1);
- kvm_x86_ops->hardware_unsetup();
- kvm_x86_ops = NULL;
-static __init int kvm_init(void)
- static struct page *bad_page;
- r = kvm_mmu_module_init();
- kvm_init_msr_list();
- if ((bad_page = alloc_page(GFP_KERNEL)) == NULL) {
- bad_page_address = page_to_pfn(bad_page) << PAGE_SHIFT;
- memset(__va(bad_page_address), 0, PAGE_SIZE);
+ kvm_arch_hardware_unsetup();
- kvm_mmu_module_exit();
+ __free_page(bad_page);
-static __exit void kvm_exit(void)
- __free_page(pfn_to_page(bad_page_address >> PAGE_SHIFT));
- kvm_mmu_module_exit();
-module_init(kvm_init)
-module_exit(kvm_exit)
-EXPORT_SYMBOL_GPL(kvm_init_x86);
-EXPORT_SYMBOL_GPL(kvm_exit_x86);
+EXPORT_SYMBOL_GPL(kvm_exit);
diff --git a/drivers/kvm/lapic.c b/drivers/kvm/lapic.c
8195
index 238fcad..5efa6c0 100644
8196
--- a/drivers/kvm/lapic.c
8197
+++ b/drivers/kvm/lapic.c
8204
#include <linux/kvm.h>
8205
#include <linux/mm.h>
8206
#include <linux/highmem.h>
8207
@@ -172,7 +174,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
8209
int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
8211
- struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
8212
+ struct kvm_lapic *apic = vcpu->apic;
8216
@@ -183,8 +185,10 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
8218
EXPORT_SYMBOL_GPL(kvm_lapic_find_highest_irr);
8220
-int kvm_apic_set_irq(struct kvm_lapic *apic, u8 vec, u8 trig)
8221
+int kvm_apic_set_irq(struct kvm_vcpu *vcpu, u8 vec, u8 trig)
8223
+ struct kvm_lapic *apic = vcpu->apic;
8225
if (!apic_test_and_set_irr(vec, apic)) {
8226
/* a new pending irq is set in IRR */
8228
@@ -392,13 +396,12 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
8232
-struct kvm_lapic *kvm_apic_round_robin(struct kvm *kvm, u8 vector,
8233
+static struct kvm_lapic *kvm_apic_round_robin(struct kvm *kvm, u8 vector,
8234
unsigned long bitmap)
8239
- struct kvm_lapic *apic;
8240
+ struct kvm_lapic *apic = NULL;
8242
last = kvm->round_robin_prev_vcpu;
8244
@@ -415,18 +418,23 @@ struct kvm_lapic *kvm_apic_round_robin(struct kvm *kvm, u8 vector,
8245
} while (next != last);
8246
kvm->round_robin_prev_vcpu = next;
8249
- vcpu_id = ffs(bitmap) - 1;
8250
- if (vcpu_id < 0) {
8252
- printk(KERN_DEBUG "vcpu not ready for apic_round_robin\n");
8254
- apic = kvm->vcpus[vcpu_id]->apic;
8257
+ printk(KERN_DEBUG "vcpu not ready for apic_round_robin\n");
8262
+struct kvm_vcpu *kvm_get_lowest_prio_vcpu(struct kvm *kvm, u8 vector,
8263
+ unsigned long bitmap)
8265
+ struct kvm_lapic *apic;
8267
+ apic = kvm_apic_round_robin(kvm, vector, bitmap);
8269
+ return apic->vcpu;
8273
static void apic_set_eoi(struct kvm_lapic *apic)
8275
int vector = apic_find_highest_isr(apic);
8276
@@ -458,7 +466,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)
8277
unsigned int delivery_mode = icr_low & APIC_MODE_MASK;
8278
unsigned int vector = icr_low & APIC_VECTOR_MASK;
8280
- struct kvm_lapic *target;
8281
+ struct kvm_vcpu *target;
8282
struct kvm_vcpu *vcpu;
8283
unsigned long lpr_map = 0;
8285
@@ -485,9 +493,9 @@ static void apic_send_ipi(struct kvm_lapic *apic)
8288
if (delivery_mode == APIC_DM_LOWEST) {
8289
- target = kvm_apic_round_robin(vcpu->kvm, vector, lpr_map);
8290
+ target = kvm_get_lowest_prio_vcpu(vcpu->kvm, vector, lpr_map);
8292
- __apic_accept_irq(target, delivery_mode,
8293
+ __apic_accept_irq(target->apic, delivery_mode,
8294
vector, level, trig_mode);
8297
@@ -762,19 +770,17 @@ static int apic_mmio_range(struct kvm_io_device *this, gpa_t addr)
8301
-void kvm_free_apic(struct kvm_lapic *apic)
8302
+void kvm_free_lapic(struct kvm_vcpu *vcpu)
- hrtimer_cancel(&apic->timer.dev);
+ hrtimer_cancel(&vcpu->apic->timer.dev);
- if (apic->regs_page) {
- __free_page(apic->regs_page);
- apic->regs_page = 0;
+ if (vcpu->apic->regs_page)
+ __free_page(vcpu->apic->regs_page);
+ kfree(vcpu->apic);
@@ -785,7 +791,7 @@ void kvm_free_apic(struct kvm_lapic *apic)
void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
- struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+ struct kvm_lapic *apic = vcpu->apic;
@@ -794,7 +800,7 @@ void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu)
- struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+ struct kvm_lapic *apic = vcpu->apic;
@@ -807,7 +813,7 @@ EXPORT_SYMBOL_GPL(kvm_lapic_get_cr8);
void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
- struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+ struct kvm_lapic *apic = vcpu->apic;
value |= MSR_IA32_APICBASE_BSP;
@@ -884,7 +890,7 @@ EXPORT_SYMBOL_GPL(kvm_lapic_reset);
int kvm_lapic_enabled(struct kvm_vcpu *vcpu)
- struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+ struct kvm_lapic *apic = vcpu->apic;
@@ -908,8 +914,7 @@ static int __apic_timer_fn(struct kvm_lapic *apic)
wait_queue_head_t *q = &apic->vcpu->wq;
atomic_inc(&apic->timer.pending);
- if (waitqueue_active(q))
+ if (waitqueue_active(q)) {
apic->vcpu->mp_state = VCPU_MP_STATE_RUNNABLE;
wake_up_interruptible(q);
@@ -962,7 +967,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
if (apic->regs_page == NULL) {
printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
+ goto nomem_free_apic;
apic->regs = page_address(apic->regs_page);
memset(apic->regs, 0, PAGE_SIZE);
@@ -980,8 +985,9 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
apic->dev.private = apic;
- kvm_free_apic(apic);
EXPORT_SYMBOL_GPL(kvm_create_lapic);
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index feb5ac9..9b9d1b6 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
#include <linux/types.h>
#include <linux/string.h>
#include <linux/mm.h>
#include <linux/highmem.h>
#include <linux/module.h>
+#include <linux/swap.h>
#include <asm/page.h>
#include <asm/cmpxchg.h>
+#include <asm/io.h>
@@ -90,7 +93,8 @@ static int dbg = 1;
#define PT32_DIR_PSE36_SIZE 4
#define PT32_DIR_PSE36_SHIFT 13
-#define PT32_DIR_PSE36_MASK (((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
+#define PT32_DIR_PSE36_MASK \
+ (((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
#define PT_FIRST_AVAIL_BITS_SHIFT 9
@@ -103,7 +107,7 @@ static int dbg = 1;
#define PT64_LEVEL_BITS 9
#define PT64_LEVEL_SHIFT(level) \
- ( PAGE_SHIFT + (level - 1) * PT64_LEVEL_BITS )
+ (PAGE_SHIFT + (level - 1) * PT64_LEVEL_BITS)
#define PT64_LEVEL_MASK(level) \
(((1ULL << PT64_LEVEL_BITS) - 1) << PT64_LEVEL_SHIFT(level))
@@ -115,7 +119,7 @@ static int dbg = 1;
#define PT32_LEVEL_BITS 10
#define PT32_LEVEL_SHIFT(level) \
- ( PAGE_SHIFT + (level - 1) * PT32_LEVEL_BITS )
+ (PAGE_SHIFT + (level - 1) * PT32_LEVEL_BITS)
#define PT32_LEVEL_MASK(level) \
(((1ULL << PT32_LEVEL_BITS) - 1) << PT32_LEVEL_SHIFT(level))
@@ -132,6 +136,8 @@ static int dbg = 1;
#define PT32_DIR_BASE_ADDR_MASK \
(PAGE_MASK & ~((1ULL << (PAGE_SHIFT + PT32_LEVEL_BITS)) - 1))
+#define PT64_PERM_MASK (PT_PRESENT_MASK | PT_WRITABLE_MASK | PT_USER_MASK \
#define PFERR_PRESENT_MASK (1U << 0)
#define PFERR_WRITE_MASK (1U << 1)
@@ -156,6 +162,16 @@ static struct kmem_cache *pte_chain_cache;
static struct kmem_cache *rmap_desc_cache;
static struct kmem_cache *mmu_page_header_cache;
+static u64 __read_mostly shadow_trap_nonpresent_pte;
+static u64 __read_mostly shadow_notrap_nonpresent_pte;
+void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte)
+ shadow_trap_nonpresent_pte = trap_pte;
+ shadow_notrap_nonpresent_pte = notrap_pte;
+EXPORT_SYMBOL_GPL(kvm_mmu_set_nonpresent_ptes);
static int is_write_protection(struct kvm_vcpu *vcpu)
return vcpu->cr0 & X86_CR0_WP;
@@ -176,11 +192,23 @@ static int is_present_pte(unsigned long pte)
return pte & PT_PRESENT_MASK;
+static int is_shadow_present_pte(u64 pte)
+ pte &= ~PT_SHADOW_IO_MARK;
+ return pte != shadow_trap_nonpresent_pte
+ && pte != shadow_notrap_nonpresent_pte;
static int is_writeble_pte(unsigned long pte)
return pte & PT_WRITABLE_MASK;
+static int is_dirty_pte(unsigned long pte)
+ return pte & PT_DIRTY_MASK;
static int is_io_pte(unsigned long pte)
return pte & PT_SHADOW_IO_MARK;
@@ -188,8 +216,15 @@ static int is_io_pte(unsigned long pte)
static int is_rmap_pte(u64 pte)
- return (pte & (PT_WRITABLE_MASK | PT_PRESENT_MASK))
- == (PT_WRITABLE_MASK | PT_PRESENT_MASK);
+ return pte != shadow_trap_nonpresent_pte
+ && pte != shadow_notrap_nonpresent_pte;
+static gfn_t pse36_gfn_delta(u32 gpte)
+ int shift = 32 - PT32_DIR_PSE36_SHIFT - PAGE_SHIFT;
+ return (gpte & PT32_DIR_PSE36_MASK) << shift;
static void set_shadow_pte(u64 *sptep, u64 spte)
@@ -259,7 +294,7 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu)
rmap_desc_cache, 1);
- r = mmu_topup_memory_cache_page(&vcpu->mmu_page_cache, 4);
+ r = mmu_topup_memory_cache_page(&vcpu->mmu_page_cache, 8);
r = mmu_topup_memory_cache(&vcpu->mmu_page_header_cache,
@@ -310,35 +345,52 @@ static void mmu_free_rmap_desc(struct kvm_rmap_desc *rd)
+ * Take gfn and return the reverse mapping to it.
+ * Note: gfn must be unaliased before this function gets called
+static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn)
+ struct kvm_memory_slot *slot;
+ slot = gfn_to_memslot(kvm, gfn);
+ return &slot->rmap[gfn - slot->base_gfn];
* Reverse mapping data structures:
- * If page->private bit zero is zero, then page->private points to the
- * shadow page table entry that points to page_address(page).
+ * If rmapp bit zero is zero, then rmapp points to the shadow page table entry
+ * that points to page_address(page).
- * If page->private bit zero is one, (then page->private & ~1) points
- * to a struct kvm_rmap_desc containing more mappings.
+ * If rmapp bit zero is one, (then rmap & ~1) points to a struct kvm_rmap_desc
+ * containing more mappings.
-static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte)
+static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
- struct page *page;
+ struct kvm_mmu_page *sp;
struct kvm_rmap_desc *desc;
+ unsigned long *rmapp;
if (!is_rmap_pte(*spte))
- page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
- if (!page_private(page)) {
+ gfn = unalias_gfn(vcpu->kvm, gfn);
+ sp = page_header(__pa(spte));
+ sp->gfns[spte - sp->spt] = gfn;
+ rmapp = gfn_to_rmap(vcpu->kvm, gfn);
rmap_printk("rmap_add: %p %llx 0->1\n", spte, *spte);
- set_page_private(page,(unsigned long)spte);
- } else if (!(page_private(page) & 1)) {
+ *rmapp = (unsigned long)spte;
+ } else if (!(*rmapp & 1)) {
rmap_printk("rmap_add: %p %llx 1->many\n", spte, *spte);
desc = mmu_alloc_rmap_desc(vcpu);
- desc->shadow_ptes[0] = (u64 *)page_private(page);
+ desc->shadow_ptes[0] = (u64 *)*rmapp;
desc->shadow_ptes[1] = spte;
- set_page_private(page,(unsigned long)desc | 1);
+ *rmapp = (unsigned long)desc | 1;
rmap_printk("rmap_add: %p %llx many->many\n", spte, *spte);
- desc = (struct kvm_rmap_desc *)(page_private(page) & ~1ul);
+ desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
while (desc->shadow_ptes[RMAP_EXT-1] && desc->more)
if (desc->shadow_ptes[RMAP_EXT-1]) {
@@ -351,7 +403,7 @@ static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte)
-static void rmap_desc_remove_entry(struct page *page,
+static void rmap_desc_remove_entry(unsigned long *rmapp,
struct kvm_rmap_desc *desc,
struct kvm_rmap_desc *prev_desc)
@@ -365,44 +417,53 @@ static void rmap_desc_remove_entry(struct page *page,
if (!prev_desc && !desc->more)
- set_page_private(page,(unsigned long)desc->shadow_ptes[0]);
+ *rmapp = (unsigned long)desc->shadow_ptes[0];
prev_desc->more = desc->more;
- set_page_private(page,(unsigned long)desc->more | 1);
+ *rmapp = (unsigned long)desc->more | 1;
mmu_free_rmap_desc(desc);
-static void rmap_remove(u64 *spte)
+static void rmap_remove(struct kvm *kvm, u64 *spte)
- struct page *page;
struct kvm_rmap_desc *desc;
struct kvm_rmap_desc *prev_desc;
+ struct kvm_mmu_page *sp;
+ struct page *page;
+ unsigned long *rmapp;
if (!is_rmap_pte(*spte))
+ sp = page_header(__pa(spte));
page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
- if (!page_private(page)) {
+ mark_page_accessed(page);
+ if (is_writeble_pte(*spte))
+ kvm_release_page_dirty(page);
+ kvm_release_page_clean(page);
+ rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt]);
printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
- } else if (!(page_private(page) & 1)) {
+ } else if (!(*rmapp & 1)) {
rmap_printk("rmap_remove: %p %llx 1->0\n", spte, *spte);
- if ((u64 *)page_private(page) != spte) {
+ if ((u64 *)*rmapp != spte) {
printk(KERN_ERR "rmap_remove: %p %llx 1->BUG\n",
- set_page_private(page,0);
rmap_printk("rmap_remove: %p %llx many->many\n", spte, *spte);
- desc = (struct kvm_rmap_desc *)(page_private(page) & ~1ul);
+ desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; ++i)
if (desc->shadow_ptes[i] == spte) {
- rmap_desc_remove_entry(page,
+ rmap_desc_remove_entry(rmapp,
@@ -414,32 +475,51 @@ static void rmap_remove(u64 *spte)
-static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
+static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
- struct kvm *kvm = vcpu->kvm;
- struct page *page;
struct kvm_rmap_desc *desc;
+ struct kvm_rmap_desc *prev_desc;
+ else if (!(*rmapp & 1)) {
+ return (u64 *)*rmapp;
+ desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
+ for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; ++i) {
+ if (prev_spte == spte)
+ return desc->shadow_ptes[i];
+ prev_spte = desc->shadow_ptes[i];
+ desc = desc->more;
+static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+ unsigned long *rmapp;
- page = gfn_to_page(kvm, gfn);
+ gfn = unalias_gfn(kvm, gfn);
+ rmapp = gfn_to_rmap(kvm, gfn);
- while (page_private(page)) {
- if (!(page_private(page) & 1))
- spte = (u64 *)page_private(page);
- desc = (struct kvm_rmap_desc *)(page_private(page) & ~1ul);
- spte = desc->shadow_ptes[0];
+ spte = rmap_next(kvm, rmapp, NULL);
- BUG_ON((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT
- != page_to_pfn(page));
BUG_ON(!(*spte & PT_PRESENT_MASK));
- BUG_ON(!(*spte & PT_WRITABLE_MASK));
rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
- rmap_remove(spte);
- set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
- kvm_flush_remote_tlbs(vcpu->kvm);
+ if (is_writeble_pte(*spte))
+ set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
+ kvm_flush_remote_tlbs(kvm);
+ spte = rmap_next(kvm, rmapp, spte);
@@ -450,7 +530,7 @@ static int is_empty_shadow_page(u64 *spt)
for (pos = spt, end = pos + PAGE_SIZE / sizeof(u64); pos != end; pos++)
+ if ((*pos & ~PT_SHADOW_IO_MARK) != shadow_trap_nonpresent_pte) {
printk(KERN_ERR "%s: %p %llx\n", __FUNCTION__,
@@ -459,13 +539,13 @@ static int is_empty_shadow_page(u64 *spt)
-static void kvm_mmu_free_page(struct kvm *kvm,
- struct kvm_mmu_page *page_head)
+static void kvm_mmu_free_page(struct kvm *kvm, struct kvm_mmu_page *sp)
- ASSERT(is_empty_shadow_page(page_head->spt));
- list_del(&page_head->link);
- __free_page(virt_to_page(page_head->spt));
+ ASSERT(is_empty_shadow_page(sp->spt));
+ list_del(&sp->link);
+ __free_page(virt_to_page(sp->spt));
+ __free_page(virt_to_page(sp->gfns));
++kvm->n_free_mmu_pages;
@@ -477,26 +557,26 @@ static unsigned kvm_page_table_hashfn(gfn_t gfn)
static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
if (!vcpu->kvm->n_free_mmu_pages)
- page = mmu_memory_cache_alloc(&vcpu->mmu_page_header_cache,
- page->spt = mmu_memory_cache_alloc(&vcpu->mmu_page_cache, PAGE_SIZE);
- set_page_private(virt_to_page(page->spt), (unsigned long)page);
- list_add(&page->link, &vcpu->kvm->active_mmu_pages);
- ASSERT(is_empty_shadow_page(page->spt));
- page->slot_bitmap = 0;
- page->multimapped = 0;
- page->parent_pte = parent_pte;
+ sp = mmu_memory_cache_alloc(&vcpu->mmu_page_header_cache, sizeof *sp);
+ sp->spt = mmu_memory_cache_alloc(&vcpu->mmu_page_cache, PAGE_SIZE);
+ sp->gfns = mmu_memory_cache_alloc(&vcpu->mmu_page_cache, PAGE_SIZE);
+ set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
+ list_add(&sp->link, &vcpu->kvm->active_mmu_pages);
+ ASSERT(is_empty_shadow_page(sp->spt));
+ sp->slot_bitmap = 0;
+ sp->multimapped = 0;
+ sp->parent_pte = parent_pte;
--vcpu->kvm->n_free_mmu_pages;
static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu,
- struct kvm_mmu_page *page, u64 *parent_pte)
+ struct kvm_mmu_page *sp, u64 *parent_pte)
struct kvm_pte_chain *pte_chain;
struct hlist_node *node;
@@ -504,20 +584,20 @@ static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu,
- if (!page->multimapped) {
- u64 *old = page->parent_pte;
+ if (!sp->multimapped) {
+ u64 *old = sp->parent_pte;
- page->parent_pte = parent_pte;
+ sp->parent_pte = parent_pte;
- page->multimapped = 1;
+ sp->multimapped = 1;
pte_chain = mmu_alloc_pte_chain(vcpu);
- INIT_HLIST_HEAD(&page->parent_ptes);
- hlist_add_head(&pte_chain->link, &page->parent_ptes);
+ INIT_HLIST_HEAD(&sp->parent_ptes);
+ hlist_add_head(&pte_chain->link, &sp->parent_ptes);
pte_chain->parent_ptes[0] = old;
- hlist_for_each_entry(pte_chain, node, &page->parent_ptes, link) {
+ hlist_for_each_entry(pte_chain, node, &sp->parent_ptes, link) {
if (pte_chain->parent_ptes[NR_PTE_CHAIN_ENTRIES-1])
for (i = 0; i < NR_PTE_CHAIN_ENTRIES; ++i)
@@ -528,23 +608,23 @@ static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu,
pte_chain = mmu_alloc_pte_chain(vcpu);
- hlist_add_head(&pte_chain->link, &page->parent_ptes);
+ hlist_add_head(&pte_chain->link, &sp->parent_ptes);
pte_chain->parent_ptes[0] = parent_pte;
-static void mmu_page_remove_parent_pte(struct kvm_mmu_page *page,
+static void mmu_page_remove_parent_pte(struct kvm_mmu_page *sp,
struct kvm_pte_chain *pte_chain;
struct hlist_node *node;
- if (!page->multimapped) {
- BUG_ON(page->parent_pte != parent_pte);
- page->parent_pte = NULL;
+ if (!sp->multimapped) {
+ BUG_ON(sp->parent_pte != parent_pte);
+ sp->parent_pte = NULL;
- hlist_for_each_entry(pte_chain, node, &page->parent_ptes, link)
+ hlist_for_each_entry(pte_chain, node, &sp->parent_ptes, link)
for (i = 0; i < NR_PTE_CHAIN_ENTRIES; ++i) {
if (!pte_chain->parent_ptes[i])
@@ -560,9 +640,9 @@ static void mmu_page_remove_parent_pte(struct kvm_mmu_page *page,
hlist_del(&pte_chain->link);
mmu_free_pte_chain(pte_chain);
- if (hlist_empty(&page->parent_ptes)) {
- page->multimapped = 0;
- page->parent_pte = NULL;
+ if (hlist_empty(&sp->parent_ptes)) {
+ sp->multimapped = 0;
+ sp->parent_pte = NULL;
@@ -570,22 +650,21 @@ static void mmu_page_remove_parent_pte(struct kvm_mmu_page *page,
-static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm_vcpu *vcpu,
+static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm *kvm, gfn_t gfn)
struct hlist_head *bucket;
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
struct hlist_node *node;
pgprintk("%s: looking for gfn %lx\n", __FUNCTION__, gfn);
index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
- bucket = &vcpu->kvm->mmu_page_hash[index];
- hlist_for_each_entry(page, node, bucket, hash_link)
- if (page->gfn == gfn && !page->role.metaphysical) {
+ bucket = &kvm->mmu_page_hash[index];
+ hlist_for_each_entry(sp, node, bucket, hash_link)
+ if (sp->gfn == gfn && !sp->role.metaphysical) {
pgprintk("%s: found role %x\n",
- __FUNCTION__, page->role.word);
+ __FUNCTION__, sp->role.word);
@@ -602,7 +681,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
struct hlist_head *bucket;
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
struct hlist_node *node;
@@ -619,38 +698,39 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
bucket = &vcpu->kvm->mmu_page_hash[index];
- hlist_for_each_entry(page, node, bucket, hash_link)
- if (page->gfn == gfn && page->role.word == role.word) {
- mmu_page_add_parent_pte(vcpu, page, parent_pte);
+ hlist_for_each_entry(sp, node, bucket, hash_link)
+ if (sp->gfn == gfn && sp->role.word == role.word) {
+ mmu_page_add_parent_pte(vcpu, sp, parent_pte);
pgprintk("%s: found\n", __FUNCTION__);
- page = kvm_mmu_alloc_page(vcpu, parent_pte);
+ sp = kvm_mmu_alloc_page(vcpu, parent_pte);
pgprintk("%s: adding gfn %lx role %x\n", __FUNCTION__, gfn, role.word);
- page->role = role;
- hlist_add_head(&page->hash_link, bucket);
+ hlist_add_head(&sp->hash_link, bucket);
+ vcpu->mmu.prefetch_page(vcpu, sp);
- rmap_write_protect(vcpu, gfn);
+ rmap_write_protect(vcpu->kvm, gfn);
static void kvm_mmu_page_unlink_children(struct kvm *kvm,
- struct kvm_mmu_page *page)
+ struct kvm_mmu_page *sp)
- if (page->role.level == PT_PAGE_TABLE_LEVEL) {
+ if (sp->role.level == PT_PAGE_TABLE_LEVEL) {
for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
- if (pt[i] & PT_PRESENT_MASK)
- rmap_remove(&pt[i]);
+ if (is_shadow_present_pte(pt[i]))
+ rmap_remove(kvm, &pt[i]);
+ pt[i] = shadow_trap_nonpresent_pte;
kvm_flush_remote_tlbs(kvm);
@@ -659,8 +739,8 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
- if (!(ent & PT_PRESENT_MASK))
+ pt[i] = shadow_trap_nonpresent_pte;
+ if (!is_shadow_present_pte(ent))
ent &= PT64_BASE_ADDR_MASK;
mmu_page_remove_parent_pte(page_header(ent), &pt[i]);
@@ -668,106 +748,120 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
kvm_flush_remote_tlbs(kvm);
-static void kvm_mmu_put_page(struct kvm_mmu_page *page,
+static void kvm_mmu_put_page(struct kvm_mmu_page *sp, u64 *parent_pte)
+ mmu_page_remove_parent_pte(sp, parent_pte);
+static void kvm_mmu_reset_last_pte_updated(struct kvm *kvm)
- mmu_page_remove_parent_pte(page, parent_pte);
+ for (i = 0; i < KVM_MAX_VCPUS; ++i)
+ if (kvm->vcpus[i])
+ kvm->vcpus[i]->last_pte_updated = NULL;
-static void kvm_mmu_zap_page(struct kvm *kvm,
- struct kvm_mmu_page *page)
+static void kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
- while (page->multimapped || page->parent_pte) {
- if (!page->multimapped)
- parent_pte = page->parent_pte;
+ ++kvm->stat.mmu_shadow_zapped;
+ while (sp->multimapped || sp->parent_pte) {
+ if (!sp->multimapped)
+ parent_pte = sp->parent_pte;
struct kvm_pte_chain *chain;
- chain = container_of(page->parent_ptes.first,
+ chain = container_of(sp->parent_ptes.first,
struct kvm_pte_chain, link);
parent_pte = chain->parent_ptes[0];
BUG_ON(!parent_pte);
- kvm_mmu_put_page(page, parent_pte);
- set_shadow_pte(parent_pte, 0);
+ kvm_mmu_put_page(sp, parent_pte);
+ set_shadow_pte(parent_pte, shadow_trap_nonpresent_pte);
- kvm_mmu_page_unlink_children(kvm, page);
- if (!page->root_count) {
- hlist_del(&page->hash_link);
- kvm_mmu_free_page(kvm, page);
+ kvm_mmu_page_unlink_children(kvm, sp);
+ if (!sp->root_count) {
+ hlist_del(&sp->hash_link);
+ kvm_mmu_free_page(kvm, sp);
- list_move(&page->link, &kvm->active_mmu_pages);
+ list_move(&sp->link, &kvm->active_mmu_pages);
+ kvm_mmu_reset_last_pte_updated(kvm);
+ * Changing the number of mmu pages allocated to the vm
+ * Note: if kvm_nr_mmu_pages is too small, you will get deadlock
+void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages)
+ * If we set the number of mmu pages to be smaller than the
+ * number of active pages, we must free some mmu pages before we
+ * change the value
+ if ((kvm->n_alloc_mmu_pages - kvm->n_free_mmu_pages) >
+ kvm_nr_mmu_pages) {
+ int n_used_mmu_pages = kvm->n_alloc_mmu_pages
+ - kvm->n_free_mmu_pages;
+ while (n_used_mmu_pages > kvm_nr_mmu_pages) {
+ struct kvm_mmu_page *page;
+ page = container_of(kvm->active_mmu_pages.prev,
+ struct kvm_mmu_page, link);
+ kvm_mmu_zap_page(kvm, page);
+ n_used_mmu_pages--;
+ kvm->n_free_mmu_pages = 0;
+ kvm->n_free_mmu_pages += kvm_nr_mmu_pages
+ - kvm->n_alloc_mmu_pages;
+ kvm->n_alloc_mmu_pages = kvm_nr_mmu_pages;
-static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
+static int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
struct hlist_head *bucket;
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
struct hlist_node *node, *n;
pgprintk("%s: looking for gfn %lx\n", __FUNCTION__, gfn);
index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
- bucket = &vcpu->kvm->mmu_page_hash[index];
- hlist_for_each_entry_safe(page, node, n, bucket, hash_link)
- if (page->gfn == gfn && !page->role.metaphysical) {
+ bucket = &kvm->mmu_page_hash[index];
+ hlist_for_each_entry_safe(sp, node, n, bucket, hash_link)
+ if (sp->gfn == gfn && !sp->role.metaphysical) {
pgprintk("%s: gfn %lx role %x\n", __FUNCTION__, gfn,
- kvm_mmu_zap_page(vcpu->kvm, page);
+ kvm_mmu_zap_page(kvm, sp);
-static void mmu_unshadow(struct kvm_vcpu *vcpu, gfn_t gfn)
+static void mmu_unshadow(struct kvm *kvm, gfn_t gfn)
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
- while ((page = kvm_mmu_lookup_page(vcpu, gfn)) != NULL) {
- pgprintk("%s: zap %lx %x\n",
- __FUNCTION__, gfn, page->role.word);
- kvm_mmu_zap_page(vcpu->kvm, page);
+ while ((sp = kvm_mmu_lookup_page(kvm, gfn)) != NULL) {
+ pgprintk("%s: zap %lx %x\n", __FUNCTION__, gfn, sp->role.word);
+ kvm_mmu_zap_page(kvm, sp);
-static void page_header_update_slot(struct kvm *kvm, void *pte, gpa_t gpa)
- int slot = memslot_id(kvm, gfn_to_memslot(kvm, gpa >> PAGE_SHIFT));
- struct kvm_mmu_page *page_head = page_header(__pa(pte));
- __set_bit(slot, &page_head->slot_bitmap);
-hpa_t safe_gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
- hpa_t hpa = gpa_to_hpa(vcpu, gpa);
- return is_error_hpa(hpa) ? bad_page_address | (gpa & ~PAGE_MASK): hpa;
-hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
+static void page_header_update_slot(struct kvm *kvm, void *pte, gfn_t gfn)
- struct page *page;
- ASSERT((gpa & HPA_ERR_MASK) == 0);
- page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
- return gpa | HPA_ERR_MASK;
- return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
- | (gpa & (PAGE_SIZE-1));
-hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
- gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
+ int slot = memslot_id(kvm, gfn_to_memslot(kvm, gfn));
+ struct kvm_mmu_page *sp = page_header(__pa(pte));
- if (gpa == UNMAPPED_GVA)
- return UNMAPPED_GVA;
- return gpa_to_hpa(vcpu, gpa);
+ __set_bit(slot, &sp->slot_bitmap);
struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
@@ -776,14 +870,14 @@ struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
if (gpa == UNMAPPED_GVA)
- return pfn_to_page(gpa_to_hpa(vcpu, gpa) >> PAGE_SHIFT);
+ return gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
-static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
+static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, struct page *page)
int level = PT32E_ROOT_LEVEL;
hpa_t table_addr = vcpu->mmu.root_hpa;
@@ -797,18 +891,29 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
table = __va(table_addr);
- if (is_present_pte(pte) && is_writeble_pte(pte))
+ was_rmapped = is_rmap_pte(pte);
+ if (is_shadow_present_pte(pte) && is_writeble_pte(pte)) {
+ kvm_release_page_clean(page);
mark_page_dirty(vcpu->kvm, v >> PAGE_SHIFT);
- page_header_update_slot(vcpu->kvm, table, v);
- table[index] = p | PT_PRESENT_MASK | PT_WRITABLE_MASK |
- rmap_add(vcpu, &table[index]);
+ page_header_update_slot(vcpu->kvm, table,
+ table[index] = page_to_phys(page)
+ | PT_PRESENT_MASK | PT_WRITABLE_MASK
+ rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
+ kvm_release_page_clean(page);
- if (table[index] == 0) {
+ if (table[index] == shadow_trap_nonpresent_pte) {
struct kvm_mmu_page *new_table;
@@ -816,9 +921,10 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
new_table = kvm_mmu_get_page(vcpu, pseudo_gfn,
- 1, 0, &table[index]);
+ 1, 3, &table[index]);
pgprintk("nonpaging_map: ENOMEM\n");
+ kvm_release_page_clean(page);
@@ -829,10 +935,19 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
+static void nonpaging_prefetch_page(struct kvm_vcpu *vcpu,
+ struct kvm_mmu_page *sp)
+ for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
+ sp->spt[i] = shadow_trap_nonpresent_pte;
static void mmu_free_roots(struct kvm_vcpu *vcpu)
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
if (!VALID_PAGE(vcpu->mmu.root_hpa))
@@ -840,8 +955,8 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
if (vcpu->mmu.shadow_root_level == PT64_ROOT_LEVEL) {
hpa_t root = vcpu->mmu.root_hpa;
- page = page_header(root);
- --page->root_count;
+ sp = page_header(root);
vcpu->mmu.root_hpa = INVALID_PAGE;
@@ -851,8 +966,8 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
root &= PT64_BASE_ADDR_MASK;
- page = page_header(root);
- --page->root_count;
+ sp = page_header(root);
vcpu->mmu.pae_root[i] = INVALID_PAGE;
@@ -863,7 +978,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
root_gfn = vcpu->cr3 >> PAGE_SHIFT;
@@ -872,10 +987,10 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
hpa_t root = vcpu->mmu.root_hpa;
ASSERT(!VALID_PAGE(root));
- page = kvm_mmu_get_page(vcpu, root_gfn, 0,
- PT64_ROOT_LEVEL, 0, 0, NULL);
- root = __pa(page->spt);
- ++page->root_count;
+ sp = kvm_mmu_get_page(vcpu, root_gfn, 0,
+ PT64_ROOT_LEVEL, 0, 0, NULL);
+ root = __pa(sp->spt);
vcpu->mmu.root_hpa = root;
@@ -892,11 +1007,11 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
root_gfn = vcpu->pdptrs[i] >> PAGE_SHIFT;
} else if (vcpu->mmu.root_level == 0)
- page = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
- PT32_ROOT_LEVEL, !is_paging(vcpu),
- root = __pa(page->spt);
- ++page->root_count;
+ sp = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
+ PT32_ROOT_LEVEL, !is_paging(vcpu),
+ root = __pa(sp->spt);
vcpu->mmu.pae_root[i] = root | PT_PRESENT_MASK;
vcpu->mmu.root_hpa = __pa(vcpu->mmu.pae_root);
@@ -908,10 +1023,9 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr)
static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
+ struct page *page;
r = mmu_topup_memory_caches(vcpu);
@@ -921,13 +1035,14 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
ASSERT(VALID_PAGE(vcpu->mmu.root_hpa));
+ page = gfn_to_page(vcpu->kvm, gva >> PAGE_SHIFT);
- paddr = gpa_to_hpa(vcpu , addr & PT64_BASE_ADDR_MASK);
- if (is_error_hpa(paddr))
+ if (is_error_page(page)) {
+ kvm_release_page_clean(page);
- return nonpaging_map(vcpu, addr & PAGE_MASK, paddr);
+ return nonpaging_map(vcpu, gva & PAGE_MASK, page);
static void nonpaging_free(struct kvm_vcpu *vcpu)
@@ -943,13 +1058,14 @@ static int nonpaging_init_context(struct kvm_vcpu *vcpu)
context->page_fault = nonpaging_page_fault;
context->gva_to_gpa = nonpaging_gva_to_gpa;
context->free = nonpaging_free;
+ context->prefetch_page = nonpaging_prefetch_page;
context->root_level = 0;
context->shadow_root_level = PT32E_ROOT_LEVEL;
context->root_hpa = INVALID_PAGE;
-static void kvm_mmu_flush_tlb(struct kvm_vcpu *vcpu)
+void kvm_mmu_flush_tlb(struct kvm_vcpu *vcpu)
++vcpu->stat.tlb_flush;
kvm_x86_ops->tlb_flush(vcpu);
@@ -989,6 +1105,7 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
context->new_cr3 = paging_new_cr3;
context->page_fault = paging64_page_fault;
context->gva_to_gpa = paging64_gva_to_gpa;
+ context->prefetch_page = paging64_prefetch_page;
context->free = paging_free;
context->root_level = level;
context->shadow_root_level = level;
@@ -1009,6 +1126,7 @@ static int paging32_init_context(struct kvm_vcpu *vcpu)
context->page_fault = paging32_page_fault;
context->gva_to_gpa = paging32_gva_to_gpa;
context->free = paging_free;
+ context->prefetch_page = paging32_prefetch_page;
context->root_level = PT32_ROOT_LEVEL;
context->shadow_root_level = PT32E_ROOT_LEVEL;
context->root_hpa = INVALID_PAGE;
@@ -1074,47 +1192,79 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu)
static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
- struct kvm_mmu_page *page,
+ struct kvm_mmu_page *sp,
struct kvm_mmu_page *child;
- if (is_present_pte(pte)) {
- if (page->role.level == PT_PAGE_TABLE_LEVEL)
- rmap_remove(spte);
+ if (is_shadow_present_pte(pte)) {
+ if (sp->role.level == PT_PAGE_TABLE_LEVEL)
+ rmap_remove(vcpu->kvm, spte);
child = page_header(pte & PT64_BASE_ADDR_MASK);
mmu_page_remove_parent_pte(child, spte);
- set_shadow_pte(spte, 0);
- kvm_flush_remote_tlbs(vcpu->kvm);
+ set_shadow_pte(spte, shadow_trap_nonpresent_pte);
static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
- struct kvm_mmu_page *page,
+ struct kvm_mmu_page *sp,
- const void *new, int bytes)
+ const void *new, int bytes,
+ int offset_in_pte)
- if (page->role.level != PT_PAGE_TABLE_LEVEL)
+ if (sp->role.level != PT_PAGE_TABLE_LEVEL) {
+ ++vcpu->kvm->stat.mmu_pde_zapped;
- if (page->role.glevels == PT32_ROOT_LEVEL)
- paging32_update_pte(vcpu, page, spte, new, bytes);
+ ++vcpu->kvm->stat.mmu_pte_updated;
+ if (sp->role.glevels == PT32_ROOT_LEVEL)
+ paging32_update_pte(vcpu, sp, spte, new, bytes, offset_in_pte);
+ paging64_update_pte(vcpu, sp, spte, new, bytes, offset_in_pte);
+static bool need_remote_flush(u64 old, u64 new)
+ if (!is_shadow_present_pte(old))
+ if (!is_shadow_present_pte(new))
+ if ((old ^ new) & PT64_BASE_ADDR_MASK)
+ old ^= PT64_NX_MASK;
+ new ^= PT64_NX_MASK;
+ return (old & ~new & PT64_PERM_MASK) != 0;
+static void mmu_pte_write_flush_tlb(struct kvm_vcpu *vcpu, u64 old, u64 new)
+ if (need_remote_flush(old, new))
+ kvm_flush_remote_tlbs(vcpu->kvm);
- paging64_update_pte(vcpu, page, spte, new, bytes);
+ kvm_mmu_flush_tlb(vcpu);
+static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
+ u64 *spte = vcpu->last_pte_updated;
+ return !!(spte && (*spte & PT_ACCESSED_MASK));
void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
const u8 *new, int bytes)
gfn_t gfn = gpa >> PAGE_SHIFT;
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
struct hlist_node *node, *n;
struct hlist_head *bucket;
unsigned offset = offset_in_page(gpa);
@@ -1126,20 +1276,24 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
pgprintk("%s: gpa %llx bytes %d\n", __FUNCTION__, gpa, bytes);
- if (gfn == vcpu->last_pt_write_gfn) {
+ ++vcpu->kvm->stat.mmu_pte_write;
+ kvm_mmu_audit(vcpu, "pre pte write");
+ if (gfn == vcpu->last_pt_write_gfn
+ && !last_updated_pte_accessed(vcpu)) {
++vcpu->last_pt_write_count;
if (vcpu->last_pt_write_count >= 3)
vcpu->last_pt_write_gfn = gfn;
vcpu->last_pt_write_count = 1;
+ vcpu->last_pte_updated = NULL;
index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
bucket = &vcpu->kvm->mmu_page_hash[index];
- hlist_for_each_entry_safe(page, node, n, bucket, hash_link) {
- if (page->gfn != gfn || page->role.metaphysical)
+ hlist_for_each_entry_safe(sp, node, n, bucket, hash_link) {
+ if (sp->gfn != gfn || sp->role.metaphysical)
- pte_size = page->role.glevels == PT32_ROOT_LEVEL ? 4 : 8;
+ pte_size = sp->role.glevels == PT32_ROOT_LEVEL ? 4 : 8;
misaligned = (offset ^ (offset + bytes - 1)) & ~(pte_size - 1);
misaligned |= bytes < 4;
if (misaligned || flooded) {
@@ -1154,14 +1308,15 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
pgprintk("misaligned: gpa %llx bytes %d role %x\n",
- gpa, bytes, page->role.word);
- kvm_mmu_zap_page(vcpu->kvm, page);
+ gpa, bytes, sp->role.word);
+ kvm_mmu_zap_page(vcpu->kvm, sp);
+ ++vcpu->kvm->stat.mmu_flooded;
page_offset = offset;
- level = page->role.level;
+ level = sp->role.level;
- if (page->role.glevels == PT32_ROOT_LEVEL) {
+ if (sp->role.glevels == PT32_ROOT_LEVEL) {
page_offset <<= 1; /* 32->64 */
* A 32-bit pde maps 4MB while the shadow pdes map
@@ -1175,44 +1330,89 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
quadrant = page_offset >> PAGE_SHIFT;
page_offset &= ~PAGE_MASK;
- if (quadrant != page->role.quadrant)
+ if (quadrant != sp->role.quadrant)
- spte = &page->spt[page_offset / sizeof(*spte)];
+ spte = &sp->spt[page_offset / sizeof(*spte)];
9521
- mmu_pte_write_zap_pte(vcpu, page, spte);
9522
- mmu_pte_write_new_pte(vcpu, page, spte, new, bytes);
9524
+ mmu_pte_write_zap_pte(vcpu, sp, spte);
9525
+ mmu_pte_write_new_pte(vcpu, sp, spte, new, bytes,
9526
+ page_offset & (pte_size - 1));
9527
+ mmu_pte_write_flush_tlb(vcpu, entry, *spte);
9531
+ kvm_mmu_audit(vcpu, "post pte write");
9534
int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
9536
gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
9538
- return kvm_mmu_unprotect_page(vcpu, gpa >> PAGE_SHIFT);
9539
+ return kvm_mmu_unprotect_page(vcpu->kvm, gpa >> PAGE_SHIFT);
9542
void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
9544
while (vcpu->kvm->n_free_mmu_pages < KVM_REFILL_PAGES) {
9545
- struct kvm_mmu_page *page;
9546
+ struct kvm_mmu_page *sp;
9548
- page = container_of(vcpu->kvm->active_mmu_pages.prev,
9549
- struct kvm_mmu_page, link);
9550
- kvm_mmu_zap_page(vcpu->kvm, page);
9551
+ sp = container_of(vcpu->kvm->active_mmu_pages.prev,
9552
+ struct kvm_mmu_page, link);
9553
+ kvm_mmu_zap_page(vcpu->kvm, sp);
9554
+ ++vcpu->kvm->stat.mmu_recycled;
9558
+int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u32 error_code)
9561
+ enum emulation_result er;
9563
+ mutex_lock(&vcpu->kvm->lock);
9564
+ r = vcpu->mmu.page_fault(vcpu, cr2, error_code);
9573
+ r = mmu_topup_memory_caches(vcpu);
9577
+ er = emulate_instruction(vcpu, vcpu->run, cr2, error_code, 0);
9578
+ mutex_unlock(&vcpu->kvm->lock);
9581
+ case EMULATE_DONE:
9583
+ case EMULATE_DO_MMIO:
9584
+ ++vcpu->stat.mmio_exits;
9586
+ case EMULATE_FAIL:
9587
+ kvm_report_emulation_failure(vcpu, "pagetable");
9593
+ mutex_unlock(&vcpu->kvm->lock);
9596
+EXPORT_SYMBOL_GPL(kvm_mmu_page_fault);
9598
static void free_mmu_pages(struct kvm_vcpu *vcpu)
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
while (!list_empty(&vcpu->kvm->active_mmu_pages)) {
- page = container_of(vcpu->kvm->active_mmu_pages.next,
- struct kvm_mmu_page, link);
- kvm_mmu_zap_page(vcpu->kvm, page);
+ sp = container_of(vcpu->kvm->active_mmu_pages.next,
+ struct kvm_mmu_page, link);
+ kvm_mmu_zap_page(vcpu->kvm, sp);
free_page((unsigned long)vcpu->mmu.pae_root);
@@ -1224,8 +1424,10 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
- vcpu->kvm->n_free_mmu_pages = KVM_NUM_MMU_PAGES;
+ if (vcpu->kvm->n_requested_mmu_pages)
+ vcpu->kvm->n_free_mmu_pages = vcpu->kvm->n_requested_mmu_pages;
+ vcpu->kvm->n_free_mmu_pages = vcpu->kvm->n_alloc_mmu_pages;
* When emulating 32-bit mode, cr3 is only 32 bits even on x86_64.
* Therefore we need to allocate shadow page tables in the first
@@ -1272,31 +1474,29 @@ void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
- list_for_each_entry(page, &kvm->active_mmu_pages, link) {
+ list_for_each_entry(sp, &kvm->active_mmu_pages, link) {
- if (!test_bit(slot, &page->slot_bitmap))
+ if (!test_bit(slot, &sp->slot_bitmap))
for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
- if (pt[i] & PT_WRITABLE_MASK) {
- rmap_remove(&pt[i]);
+ if (pt[i] & PT_WRITABLE_MASK)
pt[i] &= ~PT_WRITABLE_MASK;
void kvm_mmu_zap_all(struct kvm *kvm)
- struct kvm_mmu_page *page, *node;
+ struct kvm_mmu_page *sp, *node;
- list_for_each_entry_safe(page, node, &kvm->active_mmu_pages, link)
- kvm_mmu_zap_page(kvm, page);
+ list_for_each_entry_safe(sp, node, &kvm->active_mmu_pages, link)
+ kvm_mmu_zap_page(kvm, sp);
kvm_flush_remote_tlbs(kvm);
@@ -1337,6 +1537,25 @@ nomem:
+ * Caculate mmu pages needed for kvm.
+unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
+ unsigned int nr_mmu_pages;
+ unsigned int nr_pages = 0;
+ for (i = 0; i < kvm->nmemslots; i++)
+ nr_pages += kvm->memslots[i].npages;
+ nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
+ nr_mmu_pages = max(nr_mmu_pages,
+ (unsigned int) KVM_MIN_ALLOC_MMU_PAGES);
+ return nr_mmu_pages;
static const char *audit_msg;
@@ -1359,22 +1578,36 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
for (i = 0; i < PT64_ENT_PER_PAGE; ++i, va += va_delta) {
- if (!(ent & PT_PRESENT_MASK))
+ if (ent == shadow_trap_nonpresent_pte)
va = canonicalize(va);
+ if (ent == shadow_notrap_nonpresent_pte)
+ printk(KERN_ERR "audit: (%s) nontrapping pte"
+ " in nonleaf level: levels %d gva %lx"
+ " level %d pte %llx\n", audit_msg,
+ vcpu->mmu.root_level, va, level, ent);
audit_mappings_page(vcpu, ent, va, level - 1);
gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, va);
- hpa_t hpa = gpa_to_hpa(vcpu, gpa);
+ struct page *page = gpa_to_page(vcpu, gpa);
+ hpa_t hpa = page_to_phys(page);
- if ((ent & PT_PRESENT_MASK)
+ if (is_shadow_present_pte(ent)
&& (ent & PT64_BASE_ADDR_MASK) != hpa)
- printk(KERN_ERR "audit error: (%s) levels %d"
- " gva %lx gpa %llx hpa %llx ent %llx\n",
+ printk(KERN_ERR "xx audit error: (%s) levels %d"
+ " gva %lx gpa %llx hpa %llx ent %llx %d\n",
audit_msg, vcpu->mmu.root_level,
- va, gpa, hpa, ent);
+ va, gpa, hpa, ent,
+ is_shadow_present_pte(ent));
+ else if (ent == shadow_notrap_nonpresent_pte
+ && !is_error_hpa(hpa))
+ printk(KERN_ERR "audit: (%s) notrap shadow,"
+ " valid guest gva %lx\n", audit_msg, va);
+ kvm_release_page_clean(page);
@@ -1404,15 +1637,15 @@ static int count_rmaps(struct kvm_vcpu *vcpu)
struct kvm_rmap_desc *d;
for (j = 0; j < m->npages; ++j) {
- struct page *page = m->phys_mem[j];
+ unsigned long *rmapp = &m->rmap[j];
- if (!page->private)
- if (!(page->private & 1)) {
+ if (!(*rmapp & 1)) {
- d = (struct kvm_rmap_desc *)(page->private & ~1ul);
+ d = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
for (k = 0; k < RMAP_EXT; ++k)
if (d->shadow_ptes[k])
@@ -1429,13 +1662,13 @@ static int count_rmaps(struct kvm_vcpu *vcpu)
static int count_writable_mappings(struct kvm_vcpu *vcpu)
- struct kvm_mmu_page *page;
+ struct kvm_mmu_page *sp;
- list_for_each_entry(page, &vcpu->kvm->active_mmu_pages, link) {
- u64 *pt = page->spt;
+ list_for_each_entry(sp, &vcpu->kvm->active_mmu_pages, link) {
+ u64 *pt = sp->spt;
- if (page->role.level != PT_PAGE_TABLE_LEVEL)
+ if (sp->role.level != PT_PAGE_TABLE_LEVEL)
for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
@@ -1463,23 +1696,23 @@ static void audit_rmap(struct kvm_vcpu *vcpu)
static void audit_write_protection(struct kvm_vcpu *vcpu)
- struct kvm_mmu_page *page;
- list_for_each_entry(page, &vcpu->kvm->active_mmu_pages, link) {
+ struct kvm_mmu_page *sp;
+ struct kvm_memory_slot *slot;
+ unsigned long *rmapp;
- if (page->role.metaphysical)
+ list_for_each_entry(sp, &vcpu->kvm->active_mmu_pages, link) {
+ if (sp->role.metaphysical)
- hfn = gpa_to_hpa(vcpu, (gpa_t)page->gfn << PAGE_SHIFT)
- pg = pfn_to_page(hfn);
+ slot = gfn_to_memslot(vcpu->kvm, sp->gfn);
+ gfn = unalias_gfn(vcpu->kvm, sp->gfn);
+ rmapp = &slot->rmap[gfn - slot->base_gfn];
printk(KERN_ERR "%s: (%s) shadow page has writable"
" mappings: gfn %lx role %x\n",
- __FUNCTION__, audit_msg, page->gfn,
+ __FUNCTION__, audit_msg, sp->gfn,
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 6b094b4..b24bc7c 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
#define PT_LEVEL_MASK(level) PT64_LEVEL_MASK(level)
+ #define PT_LEVEL_BITS PT64_LEVEL_BITS
#ifdef CONFIG_X86_64
#define PT_MAX_FULL_LEVELS 4
#define PT_INDEX(addr, level) PT32_INDEX(addr, level)
#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
#define PT_LEVEL_MASK(level) PT32_LEVEL_MASK(level)
+ #define PT_LEVEL_BITS PT32_LEVEL_BITS
#define PT_MAX_FULL_LEVELS 2
#error Invalid PTTYPE value
+#define gpte_to_gfn FNAME(gpte_to_gfn)
+#define gpte_to_gfn_pde FNAME(gpte_to_gfn_pde)
* The guest_walker structure emulates the behavior of the hardware page
struct guest_walker {
gfn_t table_gfn[PT_MAX_FULL_LEVELS];
- pt_element_t *table;
- pt_element_t *ptep;
- struct page *page;
pt_element_t inherited_ar;
+static gfn_t gpte_to_gfn(pt_element_t gpte)
+ return (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
+static gfn_t gpte_to_gfn_pde(pt_element_t gpte)
+ return (gpte & PT_DIR_BASE_ADDR_MASK) >> PAGE_SHIFT;
* Fetch a guest pte for a guest virtual address
@@ -74,103 +85,88 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
struct kvm_vcpu *vcpu, gva_t addr,
int write_fault, int user_fault, int fetch_fault)
- struct kvm_memory_slot *slot;
- pt_element_t *ptep;
- pt_element_t root;
pgprintk("%s: addr %lx\n", __FUNCTION__, addr);
walker->level = vcpu->mmu.root_level;
- walker->table = NULL;
- walker->page = NULL;
- walker->ptep = NULL;
if (!is_long_mode(vcpu)) {
- walker->ptep = &vcpu->pdptrs[(addr >> 30) & 3];
- root = *walker->ptep;
- walker->pte = root;
- if (!(root & PT_PRESENT_MASK))
+ pte = vcpu->pdptrs[(addr >> 30) & 3];
+ if (!is_present_pte(pte))
- table_gfn = (root & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
- walker->table_gfn[walker->level - 1] = table_gfn;
- pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
- walker->level - 1, table_gfn);
- slot = gfn_to_memslot(vcpu->kvm, table_gfn);
- hpa = safe_gpa_to_hpa(vcpu, root & PT64_BASE_ADDR_MASK);
- walker->page = pfn_to_page(hpa >> PAGE_SHIFT);
- walker->table = kmap_atomic(walker->page, KM_USER0);
ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
(vcpu->cr3 & CR3_NONPAE_RESERVED_BITS) == 0);
walker->inherited_ar = PT_USER_MASK | PT_WRITABLE_MASK;
- int index = PT_INDEX(addr, walker->level);
+ index = PT_INDEX(addr, walker->level);
- ptep = &walker->table[index];
- walker->index = index;
- ASSERT(((unsigned long)walker->table & PAGE_MASK) ==
- ((unsigned long)ptep & PAGE_MASK));
+ table_gfn = gpte_to_gfn(pte);
+ pte_gpa = gfn_to_gpa(table_gfn);
+ pte_gpa += index * sizeof(pt_element_t);
+ walker->table_gfn[walker->level - 1] = table_gfn;
+ pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
+ walker->level - 1, table_gfn);
- if (!is_present_pte(*ptep))
+ kvm_read_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte));
+ if (!is_present_pte(pte))
- if (write_fault && !is_writeble_pte(*ptep))
+ if (write_fault && !is_writeble_pte(pte))
if (user_fault || is_write_protection(vcpu))
- if (user_fault && !(*ptep & PT_USER_MASK))
+ if (user_fault && !(pte & PT_USER_MASK))
- if (fetch_fault && is_nx(vcpu) && (*ptep & PT64_NX_MASK))
+ if (fetch_fault && is_nx(vcpu) && (pte & PT64_NX_MASK))
- if (!(*ptep & PT_ACCESSED_MASK)) {
+ if (!(pte & PT_ACCESSED_MASK)) {
mark_page_dirty(vcpu->kvm, table_gfn);
- *ptep |= PT_ACCESSED_MASK;
+ pte |= PT_ACCESSED_MASK;
+ kvm_write_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte));
if (walker->level == PT_PAGE_TABLE_LEVEL) {
- walker->gfn = (*ptep & PT_BASE_ADDR_MASK)
+ walker->gfn = gpte_to_gfn(pte);
if (walker->level == PT_DIRECTORY_LEVEL
- && (*ptep & PT_PAGE_SIZE_MASK)
+ && (pte & PT_PAGE_SIZE_MASK)
&& (PTTYPE == 64 || is_pse(vcpu))) {
- walker->gfn = (*ptep & PT_DIR_BASE_ADDR_MASK)
+ walker->gfn = gpte_to_gfn_pde(pte);
walker->gfn += PT_INDEX(addr, PT_PAGE_TABLE_LEVEL);
+ if (PTTYPE == 32 && is_cpuid_PSE36())
+ walker->gfn += pse36_gfn_delta(pte);
- walker->inherited_ar &= walker->table[index];
- table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
- kunmap_atomic(walker->table, KM_USER0);
- paddr = safe_gpa_to_hpa(vcpu, table_gfn << PAGE_SHIFT);
- walker->page = pfn_to_page(paddr >> PAGE_SHIFT);
- walker->table = kmap_atomic(walker->page, KM_USER0);
+ walker->inherited_ar &= pte;
- walker->table_gfn[walker->level - 1 ] = table_gfn;
- pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
- walker->level - 1, table_gfn);
- walker->pte = *ptep;
- walker->ptep = NULL;
- if (walker->table)
- kunmap_atomic(walker->table, KM_USER0);
- pgprintk("%s: pte %llx\n", __FUNCTION__, (u64)*ptep);
+ if (write_fault && !is_dirty_pte(pte)) {
+ mark_page_dirty(vcpu->kvm, table_gfn);
+ pte |= PT_DIRTY_MASK;
+ kvm_write_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte));
+ kvm_mmu_pte_write(vcpu, pte_gpa, (u8 *)&pte, sizeof(pte));
+ walker->pte = pte;
+ pgprintk("%s: pte %llx\n", __FUNCTION__, (u64)pte);
@@ -187,75 +183,50 @@ err:
walker->error_code |= PFERR_USER_MASK;
walker->error_code |= PFERR_FETCH_MASK;
- if (walker->table)
- kunmap_atomic(walker->table, KM_USER0);
-static void FNAME(mark_pagetable_dirty)(struct kvm *kvm,
- struct guest_walker *walker)
- mark_page_dirty(kvm, walker->table_gfn[walker->level - 1]);
-static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
- pt_element_t gpte,
- struct guest_walker *walker,
+static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t gpte,
+ u64 *shadow_pte, u64 access_bits,
+ int user_fault, int write_fault,
+ int *ptwrite, struct guest_walker *walker,
int dirty = gpte & PT_DIRTY_MASK;
- u64 spte = *shadow_pte;
- int was_rmapped = is_rmap_pte(spte);
+ int was_rmapped = is_rmap_pte(*shadow_pte);
+ struct page *page;
pgprintk("%s: spte %llx gpte %llx access %llx write_fault %d"
" user_fault %d gfn %lx\n",
- __FUNCTION__, spte, (u64)gpte, access_bits,
+ __FUNCTION__, *shadow_pte, (u64)gpte, access_bits,
write_fault, user_fault, gfn);
- if (write_fault && !dirty) {
- pt_element_t *guest_ent, *tmp = NULL;
- if (walker->ptep)
- guest_ent = walker->ptep;
- tmp = kmap_atomic(walker->page, KM_USER0);
- guest_ent = &tmp[walker->index];
- *guest_ent |= PT_DIRTY_MASK;
- if (!walker->ptep)
- kunmap_atomic(tmp, KM_USER0);
- FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
- spte |= PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;
+ access_bits &= gpte;
+ * We don't set the accessed bit, since we sometimes want to see
+ * whether the guest actually used the pte (in order to detect
+ * demand paging).
+ spte = PT_PRESENT_MASK | PT_DIRTY_MASK;
spte |= gpte & PT64_NX_MASK;
access_bits &= ~PT_WRITABLE_MASK;
- paddr = gpa_to_hpa(vcpu, gaddr & PT64_BASE_ADDR_MASK);
+ page = gfn_to_page(vcpu->kvm, gfn);
spte |= PT_PRESENT_MASK;
if (access_bits & PT_USER_MASK)
spte |= PT_USER_MASK;
- if (is_error_hpa(paddr)) {
- spte |= PT_SHADOW_IO_MARK;
- spte &= ~PT_PRESENT_MASK;
- set_shadow_pte(shadow_pte, spte);
+ if (is_error_page(page)) {
+ set_shadow_pte(shadow_pte,
+ shadow_trap_nonpresent_pte | PT_SHADOW_IO_MARK);
+ kvm_release_page_clean(page);
+ spte |= page_to_phys(page);
if ((access_bits & PT_WRITABLE_MASK)
|| (write_fault && !is_write_protection(vcpu) && !user_fault)) {
@@ -263,11 +234,11 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
spte |= PT_WRITABLE_MASK;
- mmu_unshadow(vcpu, gfn);
+ mmu_unshadow(vcpu->kvm, gfn);
- shadow = kvm_mmu_lookup_page(vcpu, gfn);
+ shadow = kvm_mmu_lookup_page(vcpu->kvm, gfn);
pgprintk("%s: found shadow page for %lx, marking ro\n",
__FUNCTION__, gfn);
@@ -284,56 +255,39 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
if (access_bits & PT_WRITABLE_MASK)
- mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
+ mark_page_dirty(vcpu->kvm, gfn);
+ pgprintk("%s: setting spte %llx\n", __FUNCTION__, spte);
set_shadow_pte(shadow_pte, spte);
- page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
- if (!was_rmapped)
- rmap_add(vcpu, shadow_pte);
-static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t gpte,
- u64 *shadow_pte, u64 access_bits,
- int user_fault, int write_fault, int *ptwrite,
- struct guest_walker *walker, gfn_t gfn)
- access_bits &= gpte;
- FNAME(set_pte_common)(vcpu, shadow_pte, gpte & PT_BASE_ADDR_MASK,
- gpte, access_bits, user_fault, write_fault,
- ptwrite, walker, gfn);
+ page_header_update_slot(vcpu->kvm, shadow_pte, gfn);
+ if (!was_rmapped) {
+ rmap_add(vcpu, shadow_pte, gfn);
+ if (!is_rmap_pte(*shadow_pte))
+ kvm_release_page_clean(page);
+ kvm_release_page_clean(page);
+ if (!ptwrite || !*ptwrite)
+ vcpu->last_pte_updated = shadow_pte;
static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
- u64 *spte, const void *pte, int bytes)
+ u64 *spte, const void *pte, int bytes,
+ int offset_in_pte)
- if (bytes < sizeof(pt_element_t))
gpte = *(const pt_element_t *)pte;
- if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK))
+ if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK)) {
+ if (!offset_in_pte && !is_present_pte(gpte))
+ set_shadow_pte(spte, shadow_notrap_nonpresent_pte);
+ if (bytes < sizeof(pt_element_t))
pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
FNAME(set_pte)(vcpu, gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK, 0,
- (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT);
-static void FNAME(set_pde)(struct kvm_vcpu *vcpu, pt_element_t gpde,
- u64 *shadow_pte, u64 access_bits,
- int user_fault, int write_fault, int *ptwrite,
- struct guest_walker *walker, gfn_t gfn)
- access_bits &= gpde;
- gaddr = (gpa_t)gfn << PAGE_SHIFT;
- if (PTTYPE == 32 && is_cpuid_PSE36())
- gaddr |= (gpde & PT32_DIR_PSE36_MASK) <<
- (32 - PT32_DIR_PSE36_SHIFT);
- FNAME(set_pte_common)(vcpu, shadow_pte, gaddr,
- gpde, access_bits, user_fault, write_fault,
- ptwrite, walker, gfn);
+ 0, NULL, NULL, gpte_to_gfn(gpte));
@@ -368,7 +322,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
unsigned hugepage_access = 0;
shadow_ent = ((u64 *)__va(shadow_addr)) + index;
- if (is_present_pte(*shadow_ent) || is_io_pte(*shadow_ent)) {
+ if (is_shadow_present_pte(*shadow_ent)) {
if (level == PT_PAGE_TABLE_LEVEL)
shadow_addr = *shadow_ent & PT64_BASE_ADDR_MASK;
@@ -384,11 +338,12 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
hugepage_access = walker->pte;
hugepage_access &= PT_USER_MASK | PT_WRITABLE_MASK;
+ if (!is_dirty_pte(walker->pte))
+ hugepage_access &= ~PT_WRITABLE_MASK;
+ hugepage_access >>= PT_WRITABLE_SHIFT;
if (walker->pte & PT64_NX_MASK)
hugepage_access |= (1 << 2);
- hugepage_access >>= PT_WRITABLE_SHIFT;
- table_gfn = (walker->pte & PT_BASE_ADDR_MASK)
+ table_gfn = gpte_to_gfn(walker->pte);
table_gfn = walker->table_gfn[level - 2];
@@ -403,16 +358,10 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
prev_shadow_ent = shadow_ent;
- if (walker->level == PT_DIRECTORY_LEVEL) {
- FNAME(set_pde)(vcpu, walker->pte, shadow_ent,
- walker->inherited_ar, user_fault, write_fault,
- ptwrite, walker, walker->gfn);
- ASSERT(walker->level == PT_PAGE_TABLE_LEVEL);
- FNAME(set_pte)(vcpu, walker->pte, shadow_ent,
- walker->inherited_ar, user_fault, write_fault,
- ptwrite, walker, walker->gfn);
+ FNAME(set_pte)(vcpu, walker->pte, shadow_ent,
+ walker->inherited_ar, user_fault, write_fault,
+ ptwrite, walker, walker->gfn);
@@ -493,13 +442,39 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
r = FNAME(walk_addr)(&walker, vcpu, vaddr, 0, 0, 0);
- gpa = (gpa_t)walker.gfn << PAGE_SHIFT;
+ gpa = gfn_to_gpa(walker.gfn);
gpa |= vaddr & ~PAGE_MASK;
+static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
+ struct kvm_mmu_page *sp)
+ int i, offset = 0;
+ pt_element_t *gpt;
+ struct page *page;
+ if (sp->role.metaphysical
+ || (PTTYPE == 32 && sp->role.level > PT_PAGE_TABLE_LEVEL)) {
+ nonpaging_prefetch_page(vcpu, sp);
+ if (PTTYPE == 32)
+ offset = sp->role.quadrant << PT64_LEVEL_BITS;
+ page = gfn_to_page(vcpu->kvm, sp->gfn);
+ gpt = kmap_atomic(page, KM_USER0);
+ for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
+ if (is_present_pte(gpt[offset + i]))
+ sp->spt[i] = shadow_trap_nonpresent_pte;
+ sp->spt[i] = shadow_notrap_nonpresent_pte;
+ kunmap_atomic(gpt, KM_USER0);
+ kvm_release_page_clean(page);
#undef pt_element_t
#undef guest_walker
@@ -508,4 +483,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
#undef SHADOW_PT_INDEX
#undef PT_LEVEL_MASK
#undef PT_DIR_BASE_ADDR_MASK
+#undef PT_LEVEL_BITS
#undef PT_MAX_FULL_LEVELS
+#undef gpte_to_gfn
+#undef gpte_to_gfn_pde
diff --git a/drivers/kvm/segment_descriptor.h b/drivers/kvm/segment_descriptor.h
index 71fdf45..56fc4c8 100644
--- a/drivers/kvm/segment_descriptor.h
+++ b/drivers/kvm/segment_descriptor.h
+#ifndef __SEGMENT_DESCRIPTOR_H
+#define __SEGMENT_DESCRIPTOR_H
struct segment_descriptor {
@@ -14,4 +17,13 @@ struct segment_descriptor {
} __attribute__((packed));
+#ifdef CONFIG_X86_64
+/* LDT or TSS descriptor in the GDT. 16 bytes. */
+struct segment_descriptor_64 {
+ struct segment_descriptor s;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 4e04e49..04e6b39 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
* the COPYING file in the top-level directory.
#include "kvm_svm.h"
#include "x86_emulate.h"
@@ -42,9 +42,6 @@ MODULE_LICENSE("GPL");
#define SEG_TYPE_LDT 2
#define SEG_TYPE_BUSY_TSS16 3
-#define KVM_EFER_LMA (1 << 10)
-#define KVM_EFER_LME (1 << 8)
#define SVM_FEATURE_NPT (1 << 0)
#define SVM_FEATURE_LBRV (1 << 1)
#define SVM_DEATURE_SVML (1 << 2)
@@ -184,8 +181,8 @@ static inline void flush_guest_tlb(struct kvm_vcpu *vcpu)
static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
- if (!(efer & KVM_EFER_LMA))
- efer &= ~KVM_EFER_LME;
+ if (!(efer & EFER_LMA))
+ efer &= ~EFER_LME;
to_svm(vcpu)->vmcb->save.efer = efer | MSR_EFER_SVME_MASK;
vcpu->shadow_efer = efer;
@@ -229,12 +226,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
printk(KERN_DEBUG "%s: NOP\n", __FUNCTION__);
- if (svm->next_rip - svm->vmcb->save.rip > MAX_INST_SIZE) {
+ if (svm->next_rip - svm->vmcb->save.rip > MAX_INST_SIZE)
printk(KERN_ERR "%s: ip 0x%llx next 0x%llx\n",
svm->vmcb->save.rip,
vcpu->rip = svm->vmcb->save.rip = svm->next_rip;
svm->vmcb->control.int_state &= ~SVM_INTERRUPT_SHADOW_MASK;
@@ -312,7 +308,7 @@ static void svm_hardware_enable(void *garbage)
svm_data->next_asid = svm_data->max_asid + 1;
svm_features = cpuid_edx(SVM_CPUID_FUNC);
- asm volatile ( "sgdt %0" : "=m"(gdt_descr) );
+ asm volatile ("sgdt %0" : "=m"(gdt_descr));
gdt = (struct desc_struct *)gdt_descr.address;
svm_data->tss_desc = (struct kvm_ldttss_desc *)(gdt + GDT_ENTRY_TSS);
@@ -476,7 +472,8 @@ static void init_vmcb(struct vmcb *vmcb)
INTERCEPT_DR5_MASK |
INTERCEPT_DR7_MASK;
- control->intercept_exceptions = 1 << PF_VECTOR;
+ control->intercept_exceptions = (1 << PF_VECTOR) |
+ (1 << UD_VECTOR);
control->intercept = (1ULL << INTERCEPT_INTR) |
@@ -543,8 +540,7 @@ static void init_vmcb(struct vmcb *vmcb)
init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
save->efer = MSR_EFER_SVME_MASK;
- save->dr6 = 0xffff0ff0;
+ save->dr6 = 0xffff0ff0;
save->rip = 0x0000fff0;
@@ -558,7 +554,7 @@ static void init_vmcb(struct vmcb *vmcb)
-static void svm_vcpu_reset(struct kvm_vcpu *vcpu)
+static int svm_vcpu_reset(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
@@ -569,6 +565,8 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu)
svm->vmcb->save.cs.base = svm->vcpu.sipi_vector << 12;
svm->vmcb->save.cs.selector = svm->vcpu.sipi_vector << 8;
static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
@@ -587,12 +585,6 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
- if (irqchip_in_kernel(kvm)) {
- err = kvm_create_lapic(&svm->vcpu);
page = alloc_page(GFP_KERNEL);
@@ -659,11 +651,11 @@ static void svm_vcpu_put(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
+ ++vcpu->stat.host_state_reload;
for (i = 0; i < NR_HOST_SAVE_USER_MSRS; i++)
wrmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
rdtscll(vcpu->host_tsc);
- kvm_put_guest_fpu(vcpu);
static void svm_vcpu_decache(struct kvm_vcpu *vcpu)
@@ -782,15 +774,15 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
struct vcpu_svm *svm = to_svm(vcpu);
#ifdef CONFIG_X86_64
- if (vcpu->shadow_efer & KVM_EFER_LME) {
+ if (vcpu->shadow_efer & EFER_LME) {
if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
- vcpu->shadow_efer |= KVM_EFER_LMA;
- svm->vmcb->save.efer |= KVM_EFER_LMA | KVM_EFER_LME;
+ vcpu->shadow_efer |= EFER_LMA;
+ svm->vmcb->save.efer |= EFER_LMA | EFER_LME;
- if (is_paging(vcpu) && !(cr0 & X86_CR0_PG) ) {
- vcpu->shadow_efer &= ~KVM_EFER_LMA;
- svm->vmcb->save.efer &= ~(KVM_EFER_LMA | KVM_EFER_LME);
+ if (is_paging(vcpu) && !(cr0 & X86_CR0_PG)) {
+ vcpu->shadow_efer &= ~EFER_LMA;
+ svm->vmcb->save.efer &= ~(EFER_LMA | EFER_LME);
@@ -938,45 +930,25 @@ static int pf_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
struct kvm *kvm = svm->vcpu.kvm;
- enum emulation_result er;
if (!irqchip_in_kernel(kvm) &&
is_external_interrupt(exit_int_info))
push_irq(&svm->vcpu, exit_int_info & SVM_EVTINJ_VEC_MASK);
- mutex_lock(&kvm->lock);
fault_address = svm->vmcb->control.exit_info_2;
error_code = svm->vmcb->control.exit_info_1;
- r = kvm_mmu_page_fault(&svm->vcpu, fault_address, error_code);
- mutex_unlock(&kvm->lock);
- mutex_unlock(&kvm->lock);
- er = emulate_instruction(&svm->vcpu, kvm_run, fault_address,
- mutex_unlock(&kvm->lock);
+ return kvm_mmu_page_fault(&svm->vcpu, fault_address, error_code);
- case EMULATE_DONE:
- case EMULATE_DO_MMIO:
- ++svm->vcpu.stat.mmio_exits;
- case EMULATE_FAIL:
- kvm_report_emulation_failure(&svm->vcpu, "pagetable");
+static int ud_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
- kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
+ er = emulate_instruction(&svm->vcpu, kvm_run, 0, 0, 0);
+ if (er != EMULATE_DONE)
+ inject_ud(&svm->vcpu);
static int nm_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
@@ -1004,7 +976,7 @@ static int shutdown_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
static int io_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
- u32 io_info = svm->vmcb->control.exit_info_1; //address size bug?
+ u32 io_info = svm->vmcb->control.exit_info_1; /* address size bug? */
int size, down, in, string, rep;
@@ -1015,7 +987,8 @@ static int io_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
string = (io_info & SVM_IOIO_STR_MASK) != 0;
- if (emulate_instruction(&svm->vcpu, kvm_run, 0, 0) == EMULATE_DO_MMIO)
+ if (emulate_instruction(&svm->vcpu,
+ kvm_run, 0, 0, 0) == EMULATE_DO_MMIO)
@@ -1045,7 +1018,8 @@ static int vmmcall_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
svm->next_rip = svm->vmcb->save.rip + 3;
skip_emulated_instruction(&svm->vcpu);
- return kvm_hypercall(&svm->vcpu, kvm_run);
+ kvm_emulate_hypercall(&svm->vcpu);
static int invalid_op_interception(struct vcpu_svm *svm,
@@ -1073,7 +1047,7 @@ static int cpuid_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
static int emulate_on_interception(struct vcpu_svm *svm,
struct kvm_run *kvm_run)
- if (emulate_instruction(&svm->vcpu, NULL, 0, 0) != EMULATE_DONE)
+ if (emulate_instruction(&svm->vcpu, NULL, 0, 0, 0) != EMULATE_DONE)
pr_unimpl(&svm->vcpu, "%s: failed\n", __FUNCTION__);
@@ -1241,6 +1215,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
[SVM_EXIT_WRITE_DR3] = emulate_on_interception,
[SVM_EXIT_WRITE_DR5] = emulate_on_interception,
[SVM_EXIT_WRITE_DR7] = emulate_on_interception,
+ [SVM_EXIT_EXCP_BASE + UD_VECTOR] = ud_interception,
[SVM_EXIT_EXCP_BASE + PF_VECTOR] = pf_interception,
[SVM_EXIT_EXCP_BASE + NM_VECTOR] = nm_interception,
[SVM_EXIT_INTR] = nop_on_interception,
@@ -1293,7 +1268,7 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
if (exit_code >= ARRAY_SIZE(svm_exit_handlers)
- || svm_exit_handlers[exit_code] == 0) {
+ || !svm_exit_handlers[exit_code]) {
kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
kvm_run->hw.hardware_exit_reason = exit_code;
@@ -1307,7 +1282,7 @@ static void reload_tss(struct kvm_vcpu *vcpu)
int cpu = raw_smp_processor_id();
struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu);
- svm_data->tss_desc->type = 9; //available 32/64-bit TSS
+ svm_data->tss_desc->type = 9; /* available 32/64-bit TSS */
@@ -1348,7 +1323,6 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu)
struct vmcb *vmcb = svm->vmcb;
int intr_vector = -1;
- kvm_inject_pending_timer_irqs(vcpu);
if ((vmcb->control.exit_int_info & SVM_EVTINJ_VALID) &&
((vmcb->control.exit_int_info & SVM_EVTINJ_TYPE_MASK) == 0)) {
intr_vector = vmcb->control.exit_int_info &
@@ -1425,12 +1399,17 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
* Interrupts blocked. Wait for unblock.
if (!svm->vcpu.interrupt_window_open &&
- (svm->vcpu.irq_summary || kvm_run->request_interrupt_window)) {
+ (svm->vcpu.irq_summary || kvm_run->request_interrupt_window))
control->intercept |= 1ULL << INTERCEPT_VINTR;
control->intercept &= ~(1ULL << INTERCEPT_VINTR);
+static int svm_set_tss_addr(struct kvm *kvm, unsigned int addr)
static void save_db_regs(unsigned long *db_regs)
asm volatile ("mov %%dr0, %0" : "=r"(db_regs[0]));
@@ -1486,13 +1465,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
#ifdef CONFIG_X86_64
- "push %%rbx; push %%rcx; push %%rdx;"
- "push %%rsi; push %%rdi; push %%rbp;"
- "push %%r8; push %%r9; push %%r10; push %%r11;"
- "push %%r12; push %%r13; push %%r14; push %%r15;"
+ "push %%rbp; \n\t"
- "push %%ebx; push %%ecx; push %%edx;"
- "push %%esi; push %%edi; push %%ebp;"
+ "push %%ebp; \n\t"
#ifdef CONFIG_X86_64
@@ -1554,10 +1529,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
"mov %%r14, %c[r14](%[svm]) \n\t"
"mov %%r15, %c[r15](%[svm]) \n\t"
- "pop %%r15; pop %%r14; pop %%r13; pop %%r12;"
- "pop %%r11; pop %%r10; pop %%r9; pop %%r8;"
- "pop %%rbp; pop %%rdi; pop %%rsi;"
- "pop %%rdx; pop %%rcx; pop %%rbx; \n\t"
+ "pop %%rbp; \n\t"
"mov %%ebx, %c[rbx](%[svm]) \n\t"
"mov %%ecx, %c[rcx](%[svm]) \n\t"
@@ -1566,29 +1538,35 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
"mov %%edi, %c[rdi](%[svm]) \n\t"
"mov %%ebp, %c[rbp](%[svm]) \n\t"
10641
- "pop %%ebp; pop %%edi; pop %%esi;"
10642
- "pop %%edx; pop %%ecx; pop %%ebx; \n\t"
10643
+ "pop %%ebp; \n\t"
10647
[vmcb]"i"(offsetof(struct vcpu_svm, vmcb_pa)),
10648
- [rbx]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RBX])),
10649
- [rcx]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RCX])),
10650
- [rdx]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RDX])),
10651
- [rsi]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RSI])),
10652
- [rdi]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RDI])),
10653
- [rbp]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RBP]))
10654
+ [rbx]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RBX])),
10655
+ [rcx]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RCX])),
10656
+ [rdx]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RDX])),
10657
+ [rsi]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RSI])),
10658
+ [rdi]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RDI])),
10659
+ [rbp]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RBP]))
10660
#ifdef CONFIG_X86_64
10661
- ,[r8 ]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R8])),
10662
- [r9 ]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R9 ])),
10663
- [r10]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R10])),
10664
- [r11]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R11])),
10665
- [r12]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R12])),
10666
- [r13]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R13])),
10667
- [r14]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R14])),
10668
- [r15]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R15]))
10669
+ , [r8]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R8])),
10670
+ [r9]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R9])),
10671
+ [r10]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R10])),
10672
+ [r11]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R11])),
10673
+ [r12]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R12])),
10674
+ [r13]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R13])),
10675
+ [r14]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R14])),
10676
+ [r15]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R15]))
10678
- : "cc", "memory" );
10680
+#ifdef CONFIG_X86_64
10681
+ , "rbx", "rcx", "rdx", "rsi", "rdi"
10682
+ , "r8", "r9", "r10", "r11" , "r12", "r13", "r14", "r15"
10684
+ , "ebx", "ecx", "edx" , "esi", "edi"
10688
if ((svm->vmcb->save.dr7 & 0xff))
10689
load_db_regs(svm->host_db_regs);
10690
@@ -1675,7 +1653,6 @@ svm_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
10691
hypercall[0] = 0x0f;
10692
hypercall[1] = 0x01;
10693
hypercall[2] = 0xd9;
10694
- hypercall[3] = 0xc3;
10697
static void svm_check_processor_compat(void *rtn)
10698
@@ -1737,17 +1714,19 @@ static struct kvm_x86_ops svm_x86_ops = {
10699
.set_irq = svm_set_irq,
10700
.inject_pending_irq = svm_intr_assist,
10701
.inject_pending_vectors = do_interrupt_requests,
10703
+ .set_tss_addr = svm_set_tss_addr,
10706
static int __init svm_init(void)
10708
- return kvm_init_x86(&svm_x86_ops, sizeof(struct vcpu_svm),
10709
+ return kvm_init(&svm_x86_ops, sizeof(struct vcpu_svm),
10713
static void __exit svm_exit(void)
10719
module_init(svm_init)
10720
diff --git a/drivers/kvm/svm.h b/drivers/kvm/svm.h
10721
index 3b1b0f3..5fa277c 100644
10722
--- a/drivers/kvm/svm.h
10723
+++ b/drivers/kvm/svm.h
10724
@@ -311,7 +311,7 @@ struct __attribute__ ((__packed__)) vmcb {
10726
#define SVM_EXIT_ERR -1
10728
-#define SVM_CR0_SELECTIVE_MASK (1 << 3 | 1) // TS and MP
10729
+#define SVM_CR0_SELECTIVE_MASK (1 << 3 | 1) /* TS and MP */
10731
#define SVM_VMLOAD ".byte 0x0f, 0x01, 0xda"
10732
#define SVM_VMRUN ".byte 0x0f, 0x01, 0xd8"
10733
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
10734
index bb56ae3..8e43feb 100644
10735
--- a/drivers/kvm/vmx.c
10736
+++ b/drivers/kvm/vmx.c
10742
#include "x86_emulate.h"
10746
#include <linux/mm.h>
10747
#include <linux/highmem.h>
10748
#include <linux/sched.h>
10749
+#include <linux/moduleparam.h>
10751
#include <asm/io.h>
#include <asm/desc.h>
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
+static int bypass_guest_pf = 1;
+module_param(bypass_guest_pf, bool, 0);
@@ -43,6 +48,7 @@ struct vcpu_vmx {
struct kvm_vcpu vcpu;
+ u32 idt_vectoring_info;
struct kvm_msr_entry *guest_msrs;
struct kvm_msr_entry *host_msrs;
@@ -57,8 +63,15 @@ struct vcpu_vmx {
u16 fs_sel, gs_sel, ldt_sel;
int gs_ldt_reload_needed;
int fs_reload_needed;
+ int guest_efer_loaded;
static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
@@ -74,14 +87,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
static struct page *vmx_io_bitmap_a;
static struct page *vmx_io_bitmap_b;
-#define EFER_SAVE_RESTORE_BITS ((u64)EFER_SCE)
static struct vmcs_config {
u32 pin_based_exec_ctrl;
u32 cpu_based_exec_ctrl;
+ u32 cpu_based_2nd_exec_ctrl;
@@ -138,18 +150,6 @@ static void save_msrs(struct kvm_msr_entry *e, int n)
rdmsrl(e[i].index, e[i].data);
-static inline u64 msr_efer_save_restore_bits(struct kvm_msr_entry msr)
- return (u64)msr.data & EFER_SAVE_RESTORE_BITS;
-static inline int msr_efer_need_save_restore(struct vcpu_vmx *vmx)
- int efer_offset = vmx->msr_offset_efer;
- return msr_efer_save_restore_bits(vmx->host_msrs[efer_offset]) !=
- msr_efer_save_restore_bits(vmx->guest_msrs[efer_offset]);
static inline int is_page_fault(u32 intr_info)
return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
@@ -164,6 +164,13 @@ static inline int is_no_device(u32 intr_info)
(INTR_TYPE_EXCEPTION | NM_VECTOR | INTR_INFO_VALID_MASK);
+static inline int is_invalid_opcode(u32 intr_info)
+ return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
+ INTR_INFO_VALID_MASK)) ==
+ (INTR_TYPE_EXCEPTION | UD_VECTOR | INTR_INFO_VALID_MASK);
static inline int is_external_interrupt(u32 intr_info)
return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -180,6 +187,24 @@ static inline int vm_need_tpr_shadow(struct kvm *kvm)
return ((cpu_has_vmx_tpr_shadow()) && (irqchip_in_kernel(kvm)));
+static inline int cpu_has_secondary_exec_ctrls(void)
+ return (vmcs_config.cpu_based_exec_ctrl &
+ CPU_BASED_ACTIVATE_SECONDARY_CONTROLS);
+static inline int cpu_has_vmx_virtualize_apic_accesses(void)
+ return (vmcs_config.cpu_based_2nd_exec_ctrl &
+ SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
+static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm)
+ return ((cpu_has_vmx_virtualize_apic_accesses()) &&
+ (irqchip_in_kernel(kvm)));
static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr)
@@ -227,11 +252,9 @@ static void __vcpu_clear(void *arg)
static void vcpu_clear(struct vcpu_vmx *vmx)
- if (vmx->vcpu.cpu != raw_smp_processor_id() && vmx->vcpu.cpu != -1)
- smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear,
- __vcpu_clear(vmx);
+ if (vmx->vcpu.cpu == -1)
+ smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear, vmx, 0, 1);
@@ -275,7 +298,7 @@ static void vmcs_writel(unsigned long field, unsigned long value)
asm volatile (ASM_VMX_VMWRITE_RAX_RDX "; setna %0"
- : "=q"(error) : "a"(value), "d"(field) : "cc" );
+ : "=q"(error) : "a"(value), "d"(field) : "cc");
if (unlikely(error))
vmwrite_error(field, value);
@@ -315,7 +338,7 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
- eb = 1u << PF_VECTOR;
+ eb = (1u << PF_VECTOR) | (1u << UD_VECTOR);
if (!vcpu->fpu_active)
eb |= 1u << NM_VECTOR;
if (vcpu->guest_debug.enabled)
@@ -344,16 +367,42 @@ static void reload_tss(void)
static void load_transition_efer(struct vcpu_vmx *vmx)
int efer_offset = vmx->msr_offset_efer;
+ u64 host_efer = vmx->host_msrs[efer_offset].data;
+ u64 guest_efer = vmx->guest_msrs[efer_offset].data;
- trans_efer = vmx->host_msrs[efer_offset].data;
- trans_efer &= ~EFER_SAVE_RESTORE_BITS;
- trans_efer |= msr_efer_save_restore_bits(vmx->guest_msrs[efer_offset]);
- wrmsrl(MSR_EFER, trans_efer);
+ if (efer_offset < 0)
+ * NX is emulated; LMA and LME handled by hardware; SCE meaninless
+ * outside long mode
+ ignore_bits = EFER_NX | EFER_SCE;
+#ifdef CONFIG_X86_64
+ ignore_bits |= EFER_LMA | EFER_LME;
+ /* SCE is meaningful only in long mode on Intel */
+ if (guest_efer & EFER_LMA)
+ ignore_bits &= ~(u64)EFER_SCE;
+ if ((guest_efer & ~ignore_bits) == (host_efer & ~ignore_bits))
+ vmx->host_state.guest_efer_loaded = 1;
+ guest_efer &= ~ignore_bits;
+ guest_efer |= host_efer & ignore_bits;
+ wrmsrl(MSR_EFER, guest_efer);
vmx->vcpu.stat.efer_reload++;
+static void reload_host_efer(struct vcpu_vmx *vmx)
+ if (vmx->host_state.guest_efer_loaded) {
+ vmx->host_state.guest_efer_loaded = 0;
+ load_msrs(vmx->host_msrs + vmx->msr_offset_efer, 1);
static void vmx_save_host_state(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -393,14 +442,13 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
#ifdef CONFIG_X86_64
- if (is_long_mode(&vmx->vcpu)) {
+ if (is_long_mode(&vmx->vcpu))
save_msrs(vmx->host_msrs +
vmx->msr_offset_kernel_gs_base, 1);
load_msrs(vmx->guest_msrs, vmx->save_nmsrs);
- if (msr_efer_need_save_restore(vmx))
- load_transition_efer(vmx);
+ load_transition_efer(vmx);
static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -410,6 +458,7 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
if (!vmx->host_state.loaded)
+ ++vmx->vcpu.stat.host_state_reload;
vmx->host_state.loaded = 0;
if (vmx->host_state.fs_reload_needed)
load_fs(vmx->host_state.fs_sel);
@@ -429,8 +478,7 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
save_msrs(vmx->guest_msrs, vmx->save_nmsrs);
load_msrs(vmx->host_msrs, vmx->save_nmsrs);
- if (msr_efer_need_save_restore(vmx))
- load_msrs(vmx->host_msrs + vmx->msr_offset_efer, 1);
+ reload_host_efer(vmx);
@@ -488,7 +536,6 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
vmx_load_host_state(to_vmx(vcpu));
- kvm_put_guest_fpu(vcpu);
static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
@@ -560,6 +607,14 @@ static void vmx_inject_gp(struct kvm_vcpu *vcpu, unsigned error_code)
INTR_INFO_VALID_MASK);
+static void vmx_inject_ud(struct kvm_vcpu *vcpu)
+ vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
+ INTR_TYPE_EXCEPTION |
+ INTR_INFO_VALID_MASK);
* Swap MSR entry in host/guest MSR entry array.
@@ -712,8 +767,10 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
#ifdef CONFIG_X86_64
ret = kvm_set_msr_common(vcpu, msr_index, data);
- if (vmx->host_state.loaded)
+ if (vmx->host_state.loaded) {
+ reload_host_efer(vmx);
load_transition_efer(vmx);
vmcs_writel(GUEST_FS_BASE, data);
@@ -808,14 +865,15 @@ static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_debug_guest *dbg)
static int vmx_get_irq(struct kvm_vcpu *vcpu)
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 idtv_info_field;
- idtv_info_field = vmcs_read32(IDT_VECTORING_INFO_FIELD);
+ idtv_info_field = vmx->idt_vectoring_info;
if (idtv_info_field & INTR_INFO_VALID_MASK) {
if (is_external_interrupt(idtv_info_field))
return idtv_info_field & VECTORING_INFO_VECTOR_MASK;
- printk("pending exception: not handled yet\n");
+ printk(KERN_DEBUG "pending exception: not handled yet\n");
@@ -863,7 +921,7 @@ static void hardware_disable(void *garbage)
static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
- u32 msr, u32* result)
+ u32 msr, u32 *result)
u32 vmx_msr_low, vmx_msr_high;
u32 ctl = ctl_min | ctl_opt;
@@ -887,6 +945,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
u32 _pin_based_exec_control = 0;
u32 _cpu_based_exec_control = 0;
+ u32 _cpu_based_2nd_exec_control = 0;
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
@@ -904,11 +963,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
CPU_BASED_USE_IO_BITMAPS |
CPU_BASED_MOV_DR_EXITING |
CPU_BASED_USE_TSC_OFFSETING;
-#ifdef CONFIG_X86_64
- opt = CPU_BASED_TPR_SHADOW;
+ opt = CPU_BASED_TPR_SHADOW |
+ CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
&_cpu_based_exec_control) < 0)
@@ -917,6 +973,19 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
_cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
~CPU_BASED_CR8_STORE_EXITING;
+ if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
+ opt = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+ SECONDARY_EXEC_WBINVD_EXITING;
+ if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS2,
+ &_cpu_based_2nd_exec_control) < 0)
+#ifndef CONFIG_X86_64
+ if (!(_cpu_based_2nd_exec_control &
+ SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
+ _cpu_based_exec_control &= ~CPU_BASED_TPR_SHADOW;
#ifdef CONFIG_X86_64
@@ -954,6 +1023,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
+ vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
vmcs_conf->vmexit_ctrl = _vmexit_control;
vmcs_conf->vmentry_ctrl = _vmentry_control;
@@ -1072,10 +1142,14 @@ static void enter_pmode(struct kvm_vcpu *vcpu)
vmcs_write32(GUEST_CS_AR_BYTES, 0x9b);
-static gva_t rmode_tss_base(struct kvm* kvm)
+static gva_t rmode_tss_base(struct kvm *kvm)
- gfn_t base_gfn = kvm->memslots[0].base_gfn + kvm->memslots[0].npages - 3;
- return base_gfn << PAGE_SHIFT;
+ if (!kvm->tss_addr) {
+ gfn_t base_gfn = kvm->memslots[0].base_gfn +
+ kvm->memslots[0].npages - 3;
+ return base_gfn << PAGE_SHIFT;
+ return kvm->tss_addr;
static void fix_rmode_seg(int seg, struct kvm_save_segment *save)
@@ -1086,7 +1160,8 @@ static void fix_rmode_seg(int seg, struct kvm_save_segment *save)
save->base = vmcs_readl(sf->base);
save->limit = vmcs_read32(sf->limit);
save->ar = vmcs_read32(sf->ar_bytes);
- vmcs_write16(sf->selector, vmcs_readl(sf->base) >> 4);
+ vmcs_write16(sf->selector, save->base >> 4);
+ vmcs_write32(sf->base, save->base & 0xfffff);
vmcs_write32(sf->limit, 0xffff);
vmcs_write32(sf->ar_bytes, 0xf3);
@@ -1355,35 +1430,30 @@ static void vmx_set_gdt(struct kvm_vcpu *vcpu, struct descriptor_table *dt)
vmcs_writel(GUEST_GDTR_BASE, dt->base);
-static int init_rmode_tss(struct kvm* kvm)
+static int init_rmode_tss(struct kvm *kvm)
- struct page *p1, *p2, *p3;
gfn_t fn = rmode_tss_base(kvm) >> PAGE_SHIFT;
- p1 = gfn_to_page(kvm, fn++);
- p2 = gfn_to_page(kvm, fn++);
- p3 = gfn_to_page(kvm, fn);
- if (!p1 || !p2 || !p3) {
- kvm_printf(kvm,"%s: gfn_to_page failed\n", __FUNCTION__);
+ r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
+ data = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
+ r = kvm_write_guest_page(kvm, fn++, &data, 0x66, sizeof(u16));
+ r = kvm_clear_guest_page(kvm, fn++, 0, PAGE_SIZE);
+ r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
+ r = kvm_write_guest_page(kvm, fn, &data, RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1,
- page = kmap_atomic(p1, KM_USER0);
- clear_page(page);
- *(u16*)(page + 0x66) = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
- kunmap_atomic(page, KM_USER0);
- page = kmap_atomic(p2, KM_USER0);
- clear_page(page);
- kunmap_atomic(page, KM_USER0);
- page = kmap_atomic(p3, KM_USER0);
- clear_page(page);
- *(page + RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1) = ~0;
- kunmap_atomic(page, KM_USER0);
@@ -1397,6 +1467,27 @@ static void seg_setup(int seg)
vmcs_write32(sf->ar_bytes, 0x93);
+static int alloc_apic_access_page(struct kvm *kvm)
+ struct kvm_userspace_memory_region kvm_userspace_mem;
+ mutex_lock(&kvm->lock);
+ if (kvm->apic_access_page)
+ kvm_userspace_mem.slot = APIC_ACCESS_PAGE_PRIVATE_MEMSLOT;
+ kvm_userspace_mem.flags = 0;
+ kvm_userspace_mem.guest_phys_addr = 0xfee00000ULL;
+ kvm_userspace_mem.memory_size = PAGE_SIZE;
+ r = __kvm_set_memory_region(kvm, &kvm_userspace_mem, 0);
+ kvm->apic_access_page = gfn_to_page(kvm, 0xfee00);
+ mutex_unlock(&kvm->lock);
* Sets up the vmcs for emulated real mode.
@@ -1407,92 +1498,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
struct descriptor_table dt;
unsigned long kvm_vmx_return;
- if (!init_rmode_tss(vmx->vcpu.kvm)) {
- vmx->vcpu.rmode.active = 0;
- vmx->vcpu.regs[VCPU_REGS_RDX] = get_rdx_init_val();
- set_cr8(&vmx->vcpu, 0);
- msr = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
- if (vmx->vcpu.vcpu_id == 0)
- msr |= MSR_IA32_APICBASE_BSP;
- kvm_set_apic_base(&vmx->vcpu, msr);
- fx_init(&vmx->vcpu);
- * GUEST_CS_BASE should really be 0xffff0000, but VT vm86 mode
- * insists on having GUEST_CS_BASE == GUEST_CS_SELECTOR << 4. Sigh.
- if (vmx->vcpu.vcpu_id == 0) {
- vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
- vmcs_writel(GUEST_CS_BASE, 0x000f0000);
- vmcs_write16(GUEST_CS_SELECTOR, vmx->vcpu.sipi_vector << 8);
- vmcs_writel(GUEST_CS_BASE, vmx->vcpu.sipi_vector << 12);
- vmcs_write32(GUEST_CS_LIMIT, 0xffff);
- vmcs_write32(GUEST_CS_AR_BYTES, 0x9b);
- seg_setup(VCPU_SREG_DS);
- seg_setup(VCPU_SREG_ES);
- seg_setup(VCPU_SREG_FS);
- seg_setup(VCPU_SREG_GS);
- seg_setup(VCPU_SREG_SS);
- vmcs_write16(GUEST_TR_SELECTOR, 0);
- vmcs_writel(GUEST_TR_BASE, 0);
- vmcs_write32(GUEST_TR_LIMIT, 0xffff);
- vmcs_write32(GUEST_TR_AR_BYTES, 0x008b);
- vmcs_write16(GUEST_LDTR_SELECTOR, 0);
- vmcs_writel(GUEST_LDTR_BASE, 0);
- vmcs_write32(GUEST_LDTR_LIMIT, 0xffff);
- vmcs_write32(GUEST_LDTR_AR_BYTES, 0x00082);
- vmcs_write32(GUEST_SYSENTER_CS, 0);
- vmcs_writel(GUEST_SYSENTER_ESP, 0);
- vmcs_writel(GUEST_SYSENTER_EIP, 0);
- vmcs_writel(GUEST_RFLAGS, 0x02);
- if (vmx->vcpu.vcpu_id == 0)
- vmcs_writel(GUEST_RIP, 0xfff0);
- vmcs_writel(GUEST_RIP, 0);
- vmcs_writel(GUEST_RSP, 0);
- //todo: dr0 = dr1 = dr2 = dr3 = 0; dr6 = 0xffff0ff0
- vmcs_writel(GUEST_DR7, 0x400);
- vmcs_writel(GUEST_GDTR_BASE, 0);
- vmcs_write32(GUEST_GDTR_LIMIT, 0xffff);
- vmcs_writel(GUEST_IDTR_BASE, 0);
- vmcs_write32(GUEST_IDTR_LIMIT, 0xffff);
- vmcs_write32(GUEST_ACTIVITY_STATE, 0);
- vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
- vmcs_write32(GUEST_PENDING_DBG_EXCEPTIONS, 0);
vmcs_write64(IO_BITMAP_A, page_to_phys(vmx_io_bitmap_a));
vmcs_write64(IO_BITMAP_B, page_to_phys(vmx_io_bitmap_b));
- guest_write_tsc(0);
vmcs_write64(VMCS_LINK_POINTER, -1ull); /* 22.3.1.5 */
- /* Special registers */
- vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
vmcs_write32(PIN_BASED_VM_EXEC_CONTROL,
vmcs_config.pin_based_exec_ctrl);
@@ -1507,8 +1521,16 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);
- vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
- vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
+ if (cpu_has_secondary_exec_ctrls()) {
+ exec_control = vmcs_config.cpu_based_2nd_exec_ctrl;
+ if (!vm_need_virtualize_apic_accesses(vmx->vcpu.kvm))
+ ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+ vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+ vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, !!bypass_guest_pf);
+ vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, !!bypass_guest_pf);
vmcs_write32(CR3_TARGET_COUNT, 0); /* 22.2.1 */
vmcs_writel(HOST_CR0, read_cr0()); /* 22.2.3 */
@@ -1536,7 +1558,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmcs_writel(HOST_IDTR_BASE, dt.base); /* 22.2.4 */
- asm ("mov $.Lkvm_vmx_return, %0" : "=r"(kvm_vmx_return));
+ asm("mov $.Lkvm_vmx_return, %0" : "=r"(kvm_vmx_return));
vmcs_writel(HOST_RIP, kvm_vmx_return); /* 22.2.5 */
vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0);
@@ -1567,97 +1589,145 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
/* 22.2.1, 20.8.1 */
vmcs_write32(VM_ENTRY_CONTROLS, vmcs_config.vmentry_ctrl);
- vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0); /* 22.2.1 */
-#ifdef CONFIG_X86_64
- vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, 0);
- if (vm_need_tpr_shadow(vmx->vcpu.kvm))
- vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
- page_to_phys(vmx->vcpu.apic->regs_page));
- vmcs_write32(TPR_THRESHOLD, 0);
vmcs_writel(CR0_GUEST_HOST_MASK, ~0UL);
vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK);
- vmx->vcpu.cr0 = 0x60000010;
- vmx_set_cr0(&vmx->vcpu, vmx->vcpu.cr0); // enter rmode
- vmx_set_cr4(&vmx->vcpu, 0);
-#ifdef CONFIG_X86_64
- vmx_set_efer(&vmx->vcpu, 0);
- vmx_fpu_activate(&vmx->vcpu);
- update_exception_bitmap(&vmx->vcpu);
+ if (vm_need_virtualize_apic_accesses(vmx->vcpu.kvm))
+ if (alloc_apic_access_page(vmx->vcpu.kvm) != 0)
-static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
+static int vmx_vcpu_reset(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
- vmx_vcpu_setup(vmx);
-static void inject_rmode_irq(struct kvm_vcpu *vcpu, int irq)
- unsigned long flags;
- unsigned long ss_base = vmcs_readl(GUEST_SS_BASE);
- u16 sp = vmcs_readl(GUEST_RSP);
- u32 ss_limit = vmcs_read32(GUEST_SS_LIMIT);
- if (sp > ss_limit || sp < 6 ) {
- vcpu_printf(vcpu, "%s: #SS, rsp 0x%lx ss 0x%lx limit 0x%x\n",
- vmcs_readl(GUEST_RSP),
- vmcs_readl(GUEST_SS_BASE),
- vmcs_read32(GUEST_SS_LIMIT));
+ if (!init_rmode_tss(vmx->vcpu.kvm)) {
- if (emulator_read_std(irq * sizeof(ent), &ent, sizeof(ent), vcpu) !=
- X86EMUL_CONTINUE) {
- vcpu_printf(vcpu, "%s: read guest err\n", __FUNCTION__);
+ vmx->vcpu.rmode.active = 0;
+ vmx->vcpu.regs[VCPU_REGS_RDX] = get_rdx_init_val();
+ set_cr8(&vmx->vcpu, 0);
+ msr = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+ if (vmx->vcpu.vcpu_id == 0)
+ msr |= MSR_IA32_APICBASE_BSP;
+ kvm_set_apic_base(&vmx->vcpu, msr);
+ fx_init(&vmx->vcpu);
+ * GUEST_CS_BASE should really be 0xffff0000, but VT vm86 mode
+ * insists on having GUEST_CS_BASE == GUEST_CS_SELECTOR << 4. Sigh.
+ if (vmx->vcpu.vcpu_id == 0) {
+ vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
+ vmcs_writel(GUEST_CS_BASE, 0x000f0000);
+ vmcs_write16(GUEST_CS_SELECTOR, vmx->vcpu.sipi_vector << 8);
+ vmcs_writel(GUEST_CS_BASE, vmx->vcpu.sipi_vector << 12);
+ vmcs_write32(GUEST_CS_LIMIT, 0xffff);
+ vmcs_write32(GUEST_CS_AR_BYTES, 0x9b);
+ seg_setup(VCPU_SREG_DS);
+ seg_setup(VCPU_SREG_ES);
+ seg_setup(VCPU_SREG_FS);
+ seg_setup(VCPU_SREG_GS);
+ seg_setup(VCPU_SREG_SS);
- flags = vmcs_readl(GUEST_RFLAGS);
- cs = vmcs_readl(GUEST_CS_BASE) >> 4;
- ip = vmcs_readl(GUEST_RIP);
+ vmcs_write16(GUEST_TR_SELECTOR, 0);
+ vmcs_writel(GUEST_TR_BASE, 0);
+ vmcs_write32(GUEST_TR_LIMIT, 0xffff);
+ vmcs_write32(GUEST_TR_AR_BYTES, 0x008b);
+ vmcs_write16(GUEST_LDTR_SELECTOR, 0);
+ vmcs_writel(GUEST_LDTR_BASE, 0);
+ vmcs_write32(GUEST_LDTR_LIMIT, 0xffff);
+ vmcs_write32(GUEST_LDTR_AR_BYTES, 0x00082);
- if (emulator_write_emulated(ss_base + sp - 2, &flags, 2, vcpu) != X86EMUL_CONTINUE ||
- emulator_write_emulated(ss_base + sp - 4, &cs, 2, vcpu) != X86EMUL_CONTINUE ||
- emulator_write_emulated(ss_base + sp - 6, &ip, 2, vcpu) != X86EMUL_CONTINUE) {
- vcpu_printf(vcpu, "%s: write guest err\n", __FUNCTION__);
+ vmcs_write32(GUEST_SYSENTER_CS, 0);
+ vmcs_writel(GUEST_SYSENTER_ESP, 0);
+ vmcs_writel(GUEST_SYSENTER_EIP, 0);
+ vmcs_writel(GUEST_RFLAGS, 0x02);
+ if (vmx->vcpu.vcpu_id == 0)
+ vmcs_writel(GUEST_RIP, 0xfff0);
+ vmcs_writel(GUEST_RIP, 0);
+ vmcs_writel(GUEST_RSP, 0);
+ /* todo: dr0 = dr1 = dr2 = dr3 = 0; dr6 = 0xffff0ff0 */
+ vmcs_writel(GUEST_DR7, 0x400);
+ vmcs_writel(GUEST_GDTR_BASE, 0);
+ vmcs_write32(GUEST_GDTR_LIMIT, 0xffff);
+ vmcs_writel(GUEST_IDTR_BASE, 0);
+ vmcs_write32(GUEST_IDTR_LIMIT, 0xffff);
+ vmcs_write32(GUEST_ACTIVITY_STATE, 0);
+ vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
+ vmcs_write32(GUEST_PENDING_DBG_EXCEPTIONS, 0);
+ guest_write_tsc(0);
+ /* Special registers */
+ vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
+ vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0); /* 22.2.1 */
+ if (cpu_has_vmx_tpr_shadow()) {
+ vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, 0);
+ if (vm_need_tpr_shadow(vmx->vcpu.kvm))
+ vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
+ page_to_phys(vmx->vcpu.apic->regs_page));
+ vmcs_write32(TPR_THRESHOLD, 0);
- vmcs_writel(GUEST_RFLAGS, flags &
- ~( X86_EFLAGS_IF | X86_EFLAGS_AC | X86_EFLAGS_TF));
- vmcs_write16(GUEST_CS_SELECTOR, ent[1]) ;
- vmcs_writel(GUEST_CS_BASE, ent[1] << 4);
- vmcs_writel(GUEST_RIP, ent[0]);
- vmcs_writel(GUEST_RSP, (vmcs_readl(GUEST_RSP) & ~0xffff) | (sp - 6));
11499
+ if (vm_need_virtualize_apic_accesses(vmx->vcpu.kvm))
11500
+ vmcs_write64(APIC_ACCESS_ADDR,
11501
+ page_to_phys(vmx->vcpu.kvm->apic_access_page));
11503
+ vmx->vcpu.cr0 = 0x60000010;
11504
+ vmx_set_cr0(&vmx->vcpu, vmx->vcpu.cr0); /* enter rmode */
11505
+ vmx_set_cr4(&vmx->vcpu, 0);
11506
+#ifdef CONFIG_X86_64
11507
+ vmx_set_efer(&vmx->vcpu, 0);
11509
+ vmx_fpu_activate(&vmx->vcpu);
11510
+ update_exception_bitmap(&vmx->vcpu);
11518
static void vmx_inject_irq(struct kvm_vcpu *vcpu, int irq)
11520
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
11522
if (vcpu->rmode.active) {
11523
- inject_rmode_irq(vcpu, irq);
11524
+ vmx->rmode.irq.pending = true;
11525
+ vmx->rmode.irq.vector = irq;
11526
+ vmx->rmode.irq.rip = vmcs_readl(GUEST_RIP);
11527
+ vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
11528
+ irq | INTR_TYPE_SOFT_INTR | INTR_INFO_VALID_MASK);
11529
+ vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1);
11530
+ vmcs_writel(GUEST_RIP, vmx->rmode.irq.rip - 1);
11533
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
11534
@@ -1706,6 +1776,23 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
11535
vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control);
11538
+static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
11541
+ struct kvm_userspace_memory_region tss_mem = {
11543
+ .guest_phys_addr = addr,
11544
+ .memory_size = PAGE_SIZE * 3,
11548
+ ret = kvm_set_memory_region(kvm, &tss_mem, 0);
11551
+ kvm->tss_addr = addr;
11555
static void kvm_guest_debug_pre(struct kvm_vcpu *vcpu)
11557
struct kvm_guest_debug *dbg = &vcpu->guest_debug;
11558
@@ -1735,27 +1822,26 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
 * Cause the #SS fault with 0 error code in VM86 mode.
if (((vec == GP_VECTOR) || (vec == SS_VECTOR)) && err_code == 0)
- if (emulate_instruction(vcpu, NULL, 0, 0) == EMULATE_DONE)
+ if (emulate_instruction(vcpu, NULL, 0, 0, 0) == EMULATE_DONE)
static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 intr_info, error_code;
unsigned long cr2, rip;
enum emulation_result er;
- vect_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
+ vect_info = vmx->idt_vectoring_info;
intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
if ((vect_info & VECTORING_INFO_VALID_MASK) &&
- !is_page_fault(intr_info)) {
+ !is_page_fault(intr_info))
printk(KERN_ERR "%s: unexpected, vectoring info 0x%x "
"intr info 0x%x\n", __FUNCTION__, vect_info, intr_info);
if (!irqchip_in_kernel(vcpu->kvm) && is_external_interrupt(vect_info)) {
int irq = vect_info & VECTORING_INFO_VECTOR_MASK;
@@ -1771,39 +1857,21 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+ if (is_invalid_opcode(intr_info)) {
+ er = emulate_instruction(vcpu, kvm_run, 0, 0, 0);
+ if (er != EMULATE_DONE)
+ vmx_inject_ud(vcpu);
rip = vmcs_readl(GUEST_RIP);
if (intr_info & INTR_INFO_DELIEVER_CODE_MASK)
error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
if (is_page_fault(intr_info)) {
cr2 = vmcs_readl(EXIT_QUALIFICATION);
- mutex_lock(&vcpu->kvm->lock);
- r = kvm_mmu_page_fault(vcpu, cr2, error_code);
- mutex_unlock(&vcpu->kvm->lock);
- mutex_unlock(&vcpu->kvm->lock);
- er = emulate_instruction(vcpu, kvm_run, cr2, error_code);
- mutex_unlock(&vcpu->kvm->lock);
- case EMULATE_DONE:
- case EMULATE_DO_MMIO:
- ++vcpu->stat.mmio_exits;
- case EMULATE_FAIL:
- kvm_report_emulation_failure(vcpu, "pagetable");
+ return kvm_mmu_page_fault(vcpu, cr2, error_code);
if (vcpu->rmode.active &&
@@ -1816,7 +1884,8 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
- if ((intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK)) == (INTR_TYPE_EXCEPTION | 1)) {
+ if ((intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK)) ==
+ (INTR_TYPE_EXCEPTION | 1)) {
kvm_run->exit_reason = KVM_EXIT_DEBUG;
@@ -1850,7 +1919,8 @@ static int handle_io(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
string = (exit_qualification & 16) != 0;
- if (emulate_instruction(vcpu, kvm_run, 0, 0) == EMULATE_DO_MMIO)
+ if (emulate_instruction(vcpu,
+ kvm_run, 0, 0, 0) == EMULATE_DO_MMIO)
@@ -1873,7 +1943,6 @@ vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
hypercall[0] = 0x0f;
hypercall[1] = 0x01;
hypercall[2] = 0xc1;
- hypercall[3] = 0xc3;
static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
@@ -2059,7 +2128,35 @@ static int handle_halt(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
static int handle_vmcall(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
skip_emulated_instruction(vcpu);
- return kvm_hypercall(vcpu, kvm_run);
+ kvm_emulate_hypercall(vcpu);
+static int handle_wbinvd(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+ skip_emulated_instruction(vcpu);
+ /* TODO: Add support for VT-d/pass-through device */
+static int handle_apic_access(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+ u64 exit_qualification;
+ enum emulation_result er;
+ unsigned long offset;
+ exit_qualification = vmcs_read64(EXIT_QUALIFICATION);
+ offset = exit_qualification & 0xffful;
+ er = emulate_instruction(vcpu, kvm_run, 0, 0, 0);
+ if (er != EMULATE_DONE) {
+ "Fail to handle apic access vmexit! Offset is 0x%lx\n",
+ return -ENOTSUPP;
@@ -2081,7 +2178,9 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu,
[EXIT_REASON_PENDING_INTERRUPT] = handle_interrupt_window,
[EXIT_REASON_HLT] = handle_halt,
[EXIT_REASON_VMCALL] = handle_vmcall,
- [EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold
+ [EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold,
+ [EXIT_REASON_APIC_ACCESS] = handle_apic_access,
+ [EXIT_REASON_WBINVD] = handle_wbinvd,
static const int kvm_vmx_max_exit_handlers =
@@ -2093,9 +2192,9 @@ static const int kvm_vmx_max_exit_handlers =
static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
- u32 vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
u32 exit_reason = vmcs_read32(VM_EXIT_REASON);
struct vcpu_vmx *vmx = to_vmx(vcpu);
+ u32 vectoring_info = vmx->idt_vectoring_info;
if (unlikely(vmx->fail)) {
kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
@@ -2104,8 +2203,8 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
- if ( (vectoring_info & VECTORING_INFO_VALID_MASK) &&
- exit_reason != EXIT_REASON_EXCEPTION_NMI )
+ if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
+ exit_reason != EXIT_REASON_EXCEPTION_NMI)
printk(KERN_WARNING "%s: unexpected, valid vectoring info and "
"exit reason is 0x%x\n", __FUNCTION__, exit_reason);
if (exit_reason < kvm_vmx_max_exit_handlers
@@ -2150,16 +2249,16 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
static void vmx_intr_assist(struct kvm_vcpu *vcpu)
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 idtv_info_field, intr_info_field;
int has_ext_irq, interrupt_window_open;
- kvm_inject_pending_timer_irqs(vcpu);
update_tpr_threshold(vcpu);
has_ext_irq = kvm_cpu_has_interrupt(vcpu);
intr_info_field = vmcs_read32(VM_ENTRY_INTR_INFO_FIELD);
- idtv_info_field = vmcs_read32(IDT_VECTORING_INFO_FIELD);
+ idtv_info_field = vmx->idt_vectoring_info;
if (intr_info_field & INTR_INFO_VALID_MASK) {
if (idtv_info_field & INTR_INFO_VALID_MASK) {
/* TODO: fault when IDT_Vectoring */
@@ -2170,6 +2269,17 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
if (unlikely(idtv_info_field & INTR_INFO_VALID_MASK)) {
+ if ((idtv_info_field & VECTORING_INFO_TYPE_MASK)
+ == INTR_TYPE_EXT_INTR
+ && vcpu->rmode.active) {
+ u8 vect = idtv_info_field & VECTORING_INFO_VECTOR_MASK;
+ vmx_inject_irq(vcpu, vect);
+ if (unlikely(has_ext_irq))
+ enable_irq_window(vcpu);
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, idtv_info_field);
vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
vmcs_read32(VM_EXIT_INSTRUCTION_LEN));
@@ -2194,6 +2304,29 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
enable_irq_window(vcpu);
+ * Failure to inject an interrupt should give us the information
+ * in IDT_VECTORING_INFO_FIELD. However, if the failure occurs
+ * when fetching the interrupt redirection bitmap in the real-mode
+ * tss, this doesn't happen. So we do it ourselves.
+static void fixup_rmode_irq(struct vcpu_vmx *vmx)
+ vmx->rmode.irq.pending = 0;
+ if (vmcs_readl(GUEST_RIP) + 1 != vmx->rmode.irq.rip)
+ vmcs_writel(GUEST_RIP, vmx->rmode.irq.rip);
+ if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
+ vmx->idt_vectoring_info &= ~VECTORING_INFO_TYPE_MASK;
+ vmx->idt_vectoring_info |= INTR_TYPE_EXT_INTR;
+ vmx->idt_vectoring_info =
+ VECTORING_INFO_VALID_MASK
+ | INTR_TYPE_EXT_INTR
+ | vmx->rmode.irq.vector;
static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -2204,50 +2337,47 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
vmcs_writel(HOST_CR0, read_cr0());
/* Store host registers */
#ifdef CONFIG_X86_64
- "push %%rax; push %%rbx; push %%rdx;"
- "push %%rsi; push %%rdi; push %%rbp;"
- "push %%r8; push %%r9; push %%r10; push %%r11;"
- "push %%r12; push %%r13; push %%r14; push %%r15;"
+ "push %%rdx; push %%rbp;"
- ASM_VMX_VMWRITE_RSP_RDX "\n\t"
- "pusha; push %%ecx \n\t"
- ASM_VMX_VMWRITE_RSP_RDX "\n\t"
+ "push %%edx; push %%ebp;"
+ "push %%ecx \n\t"
+ ASM_VMX_VMWRITE_RSP_RDX "\n\t"
/* Check if vmlaunch of vmresume is needed */
- "cmp $0, %1 \n\t"
+ "cmpl $0, %c[launched](%0) \n\t"
/* Load guest registers. Don't clobber flags. */
#ifdef CONFIG_X86_64
- "mov %c[cr2](%3), %%rax \n\t"
+ "mov %c[cr2](%0), %%rax \n\t"
"mov %%rax, %%cr2 \n\t"
- "mov %c[rax](%3), %%rax \n\t"
- "mov %c[rbx](%3), %%rbx \n\t"
- "mov %c[rdx](%3), %%rdx \n\t"
- "mov %c[rsi](%3), %%rsi \n\t"
- "mov %c[rdi](%3), %%rdi \n\t"
- "mov %c[rbp](%3), %%rbp \n\t"
- "mov %c[r8](%3), %%r8 \n\t"
- "mov %c[r9](%3), %%r9 \n\t"
- "mov %c[r10](%3), %%r10 \n\t"
- "mov %c[r11](%3), %%r11 \n\t"
- "mov %c[r12](%3), %%r12 \n\t"
- "mov %c[r13](%3), %%r13 \n\t"
- "mov %c[r14](%3), %%r14 \n\t"
- "mov %c[r15](%3), %%r15 \n\t"
- "mov %c[rcx](%3), %%rcx \n\t" /* kills %3 (rcx) */
+ "mov %c[rax](%0), %%rax \n\t"
+ "mov %c[rbx](%0), %%rbx \n\t"
+ "mov %c[rdx](%0), %%rdx \n\t"
+ "mov %c[rsi](%0), %%rsi \n\t"
+ "mov %c[rdi](%0), %%rdi \n\t"
+ "mov %c[rbp](%0), %%rbp \n\t"
+ "mov %c[r8](%0), %%r8 \n\t"
+ "mov %c[r9](%0), %%r9 \n\t"
+ "mov %c[r10](%0), %%r10 \n\t"
+ "mov %c[r11](%0), %%r11 \n\t"
+ "mov %c[r12](%0), %%r12 \n\t"
+ "mov %c[r13](%0), %%r13 \n\t"
+ "mov %c[r14](%0), %%r14 \n\t"
+ "mov %c[r15](%0), %%r15 \n\t"
+ "mov %c[rcx](%0), %%rcx \n\t" /* kills %0 (rcx) */
- "mov %c[cr2](%3), %%eax \n\t"
+ "mov %c[cr2](%0), %%eax \n\t"
"mov %%eax, %%cr2 \n\t"
- "mov %c[rax](%3), %%eax \n\t"
- "mov %c[rbx](%3), %%ebx \n\t"
- "mov %c[rdx](%3), %%edx \n\t"
- "mov %c[rsi](%3), %%esi \n\t"
- "mov %c[rdi](%3), %%edi \n\t"
- "mov %c[rbp](%3), %%ebp \n\t"
- "mov %c[rcx](%3), %%ecx \n\t" /* kills %3 (ecx) */
+ "mov %c[rax](%0), %%eax \n\t"
+ "mov %c[rbx](%0), %%ebx \n\t"
+ "mov %c[rdx](%0), %%edx \n\t"
+ "mov %c[rsi](%0), %%esi \n\t"
+ "mov %c[rdi](%0), %%edi \n\t"
+ "mov %c[rbp](%0), %%ebp \n\t"
+ "mov %c[rcx](%0), %%ecx \n\t" /* kills %0 (ecx) */
/* Enter guest mode */
"jne .Llaunched \n\t"
@@ -2257,72 +2387,79 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
".Lkvm_vmx_return: "
/* Save guest registers, load host registers, keep flags */
#ifdef CONFIG_X86_64
- "xchg %3, (%%rsp) \n\t"
- "mov %%rax, %c[rax](%3) \n\t"
- "mov %%rbx, %c[rbx](%3) \n\t"
- "pushq (%%rsp); popq %c[rcx](%3) \n\t"
- "mov %%rdx, %c[rdx](%3) \n\t"
- "mov %%rsi, %c[rsi](%3) \n\t"
- "mov %%rdi, %c[rdi](%3) \n\t"
- "mov %%rbp, %c[rbp](%3) \n\t"
- "mov %%r8, %c[r8](%3) \n\t"
- "mov %%r9, %c[r9](%3) \n\t"
- "mov %%r10, %c[r10](%3) \n\t"
- "mov %%r11, %c[r11](%3) \n\t"
- "mov %%r12, %c[r12](%3) \n\t"
- "mov %%r13, %c[r13](%3) \n\t"
- "mov %%r14, %c[r14](%3) \n\t"
- "mov %%r15, %c[r15](%3) \n\t"
+ "xchg %0, (%%rsp) \n\t"
+ "mov %%rax, %c[rax](%0) \n\t"
+ "mov %%rbx, %c[rbx](%0) \n\t"
+ "pushq (%%rsp); popq %c[rcx](%0) \n\t"
+ "mov %%rdx, %c[rdx](%0) \n\t"
+ "mov %%rsi, %c[rsi](%0) \n\t"
+ "mov %%rdi, %c[rdi](%0) \n\t"
+ "mov %%rbp, %c[rbp](%0) \n\t"
+ "mov %%r8, %c[r8](%0) \n\t"
+ "mov %%r9, %c[r9](%0) \n\t"
+ "mov %%r10, %c[r10](%0) \n\t"
+ "mov %%r11, %c[r11](%0) \n\t"
+ "mov %%r12, %c[r12](%0) \n\t"
+ "mov %%r13, %c[r13](%0) \n\t"
+ "mov %%r14, %c[r14](%0) \n\t"
+ "mov %%r15, %c[r15](%0) \n\t"
"mov %%cr2, %%rax \n\t"
- "mov %%rax, %c[cr2](%3) \n\t"
- "mov (%%rsp), %3 \n\t"
+ "mov %%rax, %c[cr2](%0) \n\t"
- "pop %%rcx; pop %%r15; pop %%r14; pop %%r13; pop %%r12;"
- "pop %%r11; pop %%r10; pop %%r9; pop %%r8;"
- "pop %%rbp; pop %%rdi; pop %%rsi;"
- "pop %%rdx; pop %%rbx; pop %%rax \n\t"
+ "pop %%rbp; pop %%rbp; pop %%rdx \n\t"
- "xchg %3, (%%esp) \n\t"
- "mov %%eax, %c[rax](%3) \n\t"
- "mov %%ebx, %c[rbx](%3) \n\t"
- "pushl (%%esp); popl %c[rcx](%3) \n\t"
- "mov %%edx, %c[rdx](%3) \n\t"
- "mov %%esi, %c[rsi](%3) \n\t"
- "mov %%edi, %c[rdi](%3) \n\t"
- "mov %%ebp, %c[rbp](%3) \n\t"
+ "xchg %0, (%%esp) \n\t"
+ "mov %%eax, %c[rax](%0) \n\t"
+ "mov %%ebx, %c[rbx](%0) \n\t"
+ "pushl (%%esp); popl %c[rcx](%0) \n\t"
+ "mov %%edx, %c[rdx](%0) \n\t"
+ "mov %%esi, %c[rsi](%0) \n\t"
+ "mov %%edi, %c[rdi](%0) \n\t"
+ "mov %%ebp, %c[rbp](%0) \n\t"
"mov %%cr2, %%eax \n\t"
- "mov %%eax, %c[cr2](%3) \n\t"
- "mov (%%esp), %3 \n\t"
+ "mov %%eax, %c[cr2](%0) \n\t"
- "pop %%ecx; popa \n\t"
+ "pop %%ebp; pop %%ebp; pop %%edx \n\t"
- : "=q" (vmx->fail)
- : "r"(vmx->launched), "d"((unsigned long)HOST_RSP),
- [rax]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RAX])),
- [rbx]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RBX])),
- [rcx]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RCX])),
- [rdx]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RDX])),
- [rsi]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RSI])),
- [rdi]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RDI])),
- [rbp]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RBP])),
+ "setbe %c[fail](%0) \n\t"
+ : : "c"(vmx), "d"((unsigned long)HOST_RSP),
+ [launched]"i"(offsetof(struct vcpu_vmx, launched)),
+ [fail]"i"(offsetof(struct vcpu_vmx, fail)),
+ [rax]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RAX])),
+ [rbx]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RBX])),
+ [rcx]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RCX])),
+ [rdx]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RDX])),
+ [rsi]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RSI])),
+ [rdi]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RDI])),
+ [rbp]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_RBP])),
#ifdef CONFIG_X86_64
- [r8 ]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R8 ])),
- [r9 ]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R9 ])),
- [r10]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R10])),
- [r11]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R11])),
- [r12]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R12])),
- [r13]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R13])),
- [r14]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R14])),
- [r15]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R15])),
+ [r8]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R8])),
+ [r9]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R9])),
+ [r10]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R10])),
+ [r11]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R11])),
+ [r12]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R12])),
+ [r13]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R13])),
+ [r14]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R14])),
+ [r15]"i"(offsetof(struct vcpu_vmx, vcpu.regs[VCPU_REGS_R15])),
- [cr2]"i"(offsetof(struct kvm_vcpu, cr2))
- : "cc", "memory" );
+ [cr2]"i"(offsetof(struct vcpu_vmx, vcpu.cr2))
+#ifdef CONFIG_X86_64
+ , "rbx", "rdi", "rsi"
+ , "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"
+ , "ebx", "edi", "rsi"
- vcpu->interrupt_window_open = (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & 3) == 0;
+ vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
+ if (vmx->rmode.irq.pending)
+ fixup_rmode_irq(vmx);
+ vcpu->interrupt_window_open =
+ (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & 3) == 0;
- asm ("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
+ asm("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
@@ -2336,7 +2473,8 @@ static void vmx_inject_page_fault(struct kvm_vcpu *vcpu,
unsigned long addr,
- u32 vect_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ u32 vect_info = vmx->idt_vectoring_info;
++vcpu->stat.pf_guest;
@@ -2397,12 +2535,6 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
- if (irqchip_in_kernel(kvm)) {
- err = kvm_create_lapic(&vmx->vcpu);
vmx->guest_msrs = kmalloc(PAGE_SIZE, GFP_KERNEL);
if (!vmx->guest_msrs) {
@@ -2511,6 +2643,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
.set_irq = vmx_inject_irq,
.inject_pending_irq = vmx_intr_assist,
.inject_pending_vectors = do_interrupt_requests,
+ .set_tss_addr = vmx_set_tss_addr,
static int __init vmx_init(void)
@@ -2541,10 +2675,13 @@ static int __init vmx_init(void)
memset(iova, 0xff, PAGE_SIZE);
kunmap(vmx_io_bitmap_b);
- r = kvm_init_x86(&vmx_x86_ops, sizeof(struct vcpu_vmx), THIS_MODULE);
+ r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx), THIS_MODULE);
+ if (bypass_guest_pf)
+ kvm_mmu_set_nonpresent_ptes(~0xffeull, 0ull);
@@ -2559,7 +2696,7 @@ static void __exit vmx_exit(void)
__free_page(vmx_io_bitmap_b);
__free_page(vmx_io_bitmap_a);
module_init(vmx_init)
diff --git a/drivers/kvm/vmx.h b/drivers/kvm/vmx.h
index fd4e146..d52ae8d 100644
--- a/drivers/kvm/vmx.h
+++ b/drivers/kvm/vmx.h
+ * Definitions of Primary Processor-Based VM-Execution Controls.
#define CPU_BASED_VIRTUAL_INTR_PENDING 0x00000004
#define CPU_BASED_USE_TSC_OFFSETING 0x00000008
#define CPU_BASED_HLT_EXITING 0x00000080
#define CPU_BASED_MONITOR_EXITING 0x20000000
#define CPU_BASED_PAUSE_EXITING 0x40000000
#define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
+ * Definitions of Secondary Processor-Based VM-Execution Controls.
+#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001
+#define SECONDARY_EXEC_WBINVD_EXITING 0x00000040
#define PIN_BASED_EXT_INTR_MASK 0x00000001
#define PIN_BASED_NMI_EXITING 0x00000008
#define VM_ENTRY_SMM 0x00000400
#define VM_ENTRY_DEACT_DUAL_MONITOR 0x00000800
-#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001
/* VMCS Encodings */
GUEST_ES_SELECTOR = 0x00000800,
@@ -89,6 +96,8 @@ enum vmcs_field {
TSC_OFFSET_HIGH = 0x00002011,
VIRTUAL_APIC_PAGE_ADDR = 0x00002012,
VIRTUAL_APIC_PAGE_ADDR_HIGH = 0x00002013,
+ APIC_ACCESS_ADDR = 0x00002014,
+ APIC_ACCESS_ADDR_HIGH = 0x00002015,
VMCS_LINK_POINTER = 0x00002800,
VMCS_LINK_POINTER_HIGH = 0x00002801,
GUEST_IA32_DEBUGCTL = 0x00002802,
@@ -214,6 +223,8 @@ enum vmcs_field {
#define EXIT_REASON_MSR_WRITE 32
#define EXIT_REASON_MWAIT_INSTRUCTION 36
#define EXIT_REASON_TPR_BELOW_THRESHOLD 43
+#define EXIT_REASON_APIC_ACCESS 44
+#define EXIT_REASON_WBINVD 54
* Interruption-information format
@@ -230,13 +241,14 @@ enum vmcs_field {
#define INTR_TYPE_EXT_INTR (0 << 8) /* external interrupt */
#define INTR_TYPE_EXCEPTION (3 << 8) /* processor exception */
+#define INTR_TYPE_SOFT_INTR (4 << 8) /* software interrupt */
* Exit Qualifications for MOV for Control Register Access
-#define CONTROL_REG_ACCESS_NUM 0x7 /* 2:0, number of control register */
+#define CONTROL_REG_ACCESS_NUM 0x7 /* 2:0, number of control reg.*/
#define CONTROL_REG_ACCESS_TYPE 0x30 /* 5:4, access type */
-#define CONTROL_REG_ACCESS_REG 0xf00 /* 10:8, general purpose register */
+#define CONTROL_REG_ACCESS_REG 0xf00 /* 10:8, general purpose reg. */
#define LMSW_SOURCE_DATA_SHIFT 16
#define LMSW_SOURCE_DATA (0xFFFF << LMSW_SOURCE_DATA_SHIFT) /* 16:31 lmsw source */
#define REG_EAX (0 << 8)
@@ -259,11 +271,11 @@ enum vmcs_field {
* Exit Qualifications for MOV for Debug Register Access
-#define DEBUG_REG_ACCESS_NUM 0x7 /* 2:0, number of debug register */
+#define DEBUG_REG_ACCESS_NUM 0x7 /* 2:0, number of debug reg. */
#define DEBUG_REG_ACCESS_TYPE 0x10 /* 4, direction of access */
#define TYPE_MOV_TO_DR (0 << 4)
#define TYPE_MOV_FROM_DR (1 << 4)
-#define DEBUG_REG_ACCESS_REG 0xf00 /* 11:8, general purpose register */
+#define DEBUG_REG_ACCESS_REG 0xf00 /* 11:8, general purpose reg. */
@@ -307,4 +319,6 @@ enum vmcs_field {
#define MSR_IA32_FEATURE_CONTROL_LOCKED 0x1
#define MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED 0x4
+#define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT 9
diff --git a/drivers/kvm/x86.c b/drivers/kvm/x86.c
new file mode 100644
index 0000000..c9e4b67
+++ b/drivers/kvm/x86.c
+ * Kernel-based Virtual Machine driver for Linux
+ * derived from drivers/kvm/kvm_main.c
+ * Copyright (C) 2006 Qumranet, Inc.
+ * Avi Kivity <avi@qumranet.com>
+ * Yaniv Kamay <yaniv@qumranet.com>
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+#include "x86_emulate.h"
+#include "segment_descriptor.h"
+#include <linux/kvm.h>
+#include <linux/fs.h>
+#include <linux/vmalloc.h>
+#include <linux/module.h>
+#include <linux/mman.h>
+#include <asm/uaccess.h>
+#include <asm/msr.h>
+#define MAX_IO_MSRS 256
+#define CR0_RESERVED_BITS \
+ (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
+ | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
+ | X86_CR0_NW | X86_CR0_CD | X86_CR0_PG))
+#define CR4_RESERVED_BITS \
+ (~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
+ | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
+ | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR \
+ | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
+#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
+#define EFER_RESERVED_BITS 0xfffffffffffff2fe
+#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
+#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+struct kvm_x86_ops *kvm_x86_ops;
+struct kvm_stats_debugfs_item debugfs_entries[] = {
+ { "pf_fixed", VCPU_STAT(pf_fixed) },
+ { "pf_guest", VCPU_STAT(pf_guest) },
+ { "tlb_flush", VCPU_STAT(tlb_flush) },
+ { "invlpg", VCPU_STAT(invlpg) },
+ { "exits", VCPU_STAT(exits) },
+ { "io_exits", VCPU_STAT(io_exits) },
+ { "mmio_exits", VCPU_STAT(mmio_exits) },
+ { "signal_exits", VCPU_STAT(signal_exits) },
+ { "irq_window", VCPU_STAT(irq_window_exits) },
+ { "halt_exits", VCPU_STAT(halt_exits) },
+ { "halt_wakeup", VCPU_STAT(halt_wakeup) },
+ { "request_irq", VCPU_STAT(request_irq_exits) },
+ { "irq_exits", VCPU_STAT(irq_exits) },
+ { "host_state_reload", VCPU_STAT(host_state_reload) },
+ { "efer_reload", VCPU_STAT(efer_reload) },
+ { "fpu_reload", VCPU_STAT(fpu_reload) },
+ { "insn_emulation", VCPU_STAT(insn_emulation) },
+ { "insn_emulation_fail", VCPU_STAT(insn_emulation_fail) },
+ { "mmu_shadow_zapped", VM_STAT(mmu_shadow_zapped) },
+ { "mmu_pte_write", VM_STAT(mmu_pte_write) },
+ { "mmu_pte_updated", VM_STAT(mmu_pte_updated) },
+ { "mmu_pde_zapped", VM_STAT(mmu_pde_zapped) },
+ { "mmu_flooded", VM_STAT(mmu_flooded) },
+ { "mmu_recycled", VM_STAT(mmu_recycled) },
+ { "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
+unsigned long segment_base(u16 selector)
+ struct descriptor_table gdt;
+ struct segment_descriptor *d;
+ unsigned long table_base;
+ if (selector == 0)
+ asm("sgdt %0" : "=m"(gdt));
+ table_base = gdt.base;
+ if (selector & 4) { /* from ldt */
+ u16 ldt_selector;
+ asm("sldt %0" : "=g"(ldt_selector));
+ table_base = segment_base(ldt_selector);
+ d = (struct segment_descriptor *)(table_base + (selector & ~7));
+ v = d->base_low | ((unsigned long)d->base_mid << 16) |
+ ((unsigned long)d->base_high << 24);
+#ifdef CONFIG_X86_64
+ if (d->system == 0 && (d->type == 2 || d->type == 9 || d->type == 11))
+ v |= ((unsigned long) \
+ ((struct segment_descriptor_64 *)d)->base_higher) << 32;
+EXPORT_SYMBOL_GPL(segment_base);
+u64 kvm_get_apic_base(struct kvm_vcpu *vcpu)
+ if (irqchip_in_kernel(vcpu->kvm))
+ return vcpu->apic_base;
+ return vcpu->apic_base;
+EXPORT_SYMBOL_GPL(kvm_get_apic_base);
+void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data)
+ /* TODO: reserve bits check */
+ if (irqchip_in_kernel(vcpu->kvm))
+ kvm_lapic_set_base(vcpu, data);
+ vcpu->apic_base = data;
+EXPORT_SYMBOL_GPL(kvm_set_apic_base);
+static void inject_gp(struct kvm_vcpu *vcpu)
+ kvm_x86_ops->inject_gp(vcpu, 0);
+ * Load the pae pdptrs. Return true is they are all valid.
+int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
+ gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
+ unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
+ u64 pdpte[ARRAY_SIZE(vcpu->pdptrs)];
+ mutex_lock(&vcpu->kvm->lock);
+ ret = kvm_read_guest_page(vcpu->kvm, pdpt_gfn, pdpte,
+ offset * sizeof(u64), sizeof(pdpte));
+ for (i = 0; i < ARRAY_SIZE(pdpte); ++i) {
+ if ((pdpte[i] & 1) && (pdpte[i] & 0xfffffff0000001e6ull)) {
+ memcpy(vcpu->pdptrs, pdpte, sizeof(vcpu->pdptrs));
+ mutex_unlock(&vcpu->kvm->lock);
+static bool pdptrs_changed(struct kvm_vcpu *vcpu)
+ u64 pdpte[ARRAY_SIZE(vcpu->pdptrs)];
+ bool changed = true;
+ if (is_long_mode(vcpu) || !is_pae(vcpu))
+ mutex_lock(&vcpu->kvm->lock);
+ r = kvm_read_guest(vcpu->kvm, vcpu->cr3 & ~31u, pdpte, sizeof(pdpte));
+ changed = memcmp(pdpte, vcpu->pdptrs, sizeof(pdpte)) != 0;
+ mutex_unlock(&vcpu->kvm->lock);
+void set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+ if (cr0 & CR0_RESERVED_BITS) {
+ printk(KERN_DEBUG "set_cr0: 0x%lx #GP, reserved bits 0x%lx\n",
+ if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD)) {
+ printk(KERN_DEBUG "set_cr0: #GP, CD == 0 && NW == 1\n");
+ if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE)) {
+ printk(KERN_DEBUG "set_cr0: #GP, set PG flag "
+ "and a clear PE flag\n");
+ if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
+#ifdef CONFIG_X86_64
+ if ((vcpu->shadow_efer & EFER_LME)) {
+ if (!is_pae(vcpu)) {
+ printk(KERN_DEBUG "set_cr0: #GP, start paging "
+ "in long mode while PAE is disabled\n");
+ kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+ printk(KERN_DEBUG "set_cr0: #GP, start paging "
+ "in long mode while CS.L == 1\n");
+ if (is_pae(vcpu) && !load_pdptrs(vcpu, vcpu->cr3)) {
+ printk(KERN_DEBUG "set_cr0: #GP, pdptrs "
+ "reserved bits\n");
+ kvm_x86_ops->set_cr0(vcpu, cr0);
+ mutex_lock(&vcpu->kvm->lock);
+ kvm_mmu_reset_context(vcpu);
+ mutex_unlock(&vcpu->kvm->lock);
+EXPORT_SYMBOL_GPL(set_cr0);
+void lmsw(struct kvm_vcpu *vcpu, unsigned long msw)
+ set_cr0(vcpu, (vcpu->cr0 & ~0x0ful) | (msw & 0x0f));
+EXPORT_SYMBOL_GPL(lmsw);
+void set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+ if (cr4 & CR4_RESERVED_BITS) {
+ printk(KERN_DEBUG "set_cr4: #GP, reserved bits\n");
+ if (is_long_mode(vcpu)) {
+ if (!(cr4 & X86_CR4_PAE)) {
+ printk(KERN_DEBUG "set_cr4: #GP, clearing PAE while "
+ "in long mode\n");
+ } else if (is_paging(vcpu) && !is_pae(vcpu) && (cr4 & X86_CR4_PAE)
+ && !load_pdptrs(vcpu, vcpu->cr3)) {
+ printk(KERN_DEBUG "set_cr4: #GP, pdptrs reserved bits\n");
+ if (cr4 & X86_CR4_VMXE) {
+ printk(KERN_DEBUG "set_cr4: #GP, setting VMXE\n");
+ kvm_x86_ops->set_cr4(vcpu, cr4);
+ mutex_lock(&vcpu->kvm->lock);
+ kvm_mmu_reset_context(vcpu);
+ mutex_unlock(&vcpu->kvm->lock);
+EXPORT_SYMBOL_GPL(set_cr4);
+void set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
+ if (cr3 == vcpu->cr3 && !pdptrs_changed(vcpu)) {
+ kvm_mmu_flush_tlb(vcpu);
+ if (is_long_mode(vcpu)) {
+ if (cr3 & CR3_L_MODE_RESERVED_BITS) {
+ printk(KERN_DEBUG "set_cr3: #GP, reserved bits\n");
+ if (is_pae(vcpu)) {
+ if (cr3 & CR3_PAE_RESERVED_BITS) {
+ printk(KERN_DEBUG
+ "set_cr3: #GP, reserved bits\n");
+ if (is_paging(vcpu) && !load_pdptrs(vcpu, cr3)) {
+ printk(KERN_DEBUG "set_cr3: #GP, pdptrs "
+ "reserved bits\n");
+ * We don't check reserved bits in nonpae mode, because
+ * this isn't enforced, and VMware depends on this.
+ mutex_lock(&vcpu->kvm->lock);
+ * Does the new cr3 value map to physical memory? (Note, we
+ * catch an invalid cr3 even in real-mode, because it would
+ * cause trouble later on when we turn on paging anyway.)
+ * A real CPU would silently accept an invalid cr3 and would
+ * attempt to use it - with largely undefined (and often hard
+ * to debug) behavior on the guest side.
+ if (unlikely(!gfn_to_memslot(vcpu->kvm, cr3 >> PAGE_SHIFT)))
+ vcpu->mmu.new_cr3(vcpu);
+ mutex_unlock(&vcpu->kvm->lock);
+EXPORT_SYMBOL_GPL(set_cr3);
+void set_cr8(struct kvm_vcpu *vcpu, unsigned long cr8)
+ if (cr8 & CR8_RESERVED_BITS) {
+ printk(KERN_DEBUG "set_cr8: #GP, reserved bits 0x%lx\n", cr8);
+ if (irqchip_in_kernel(vcpu->kvm))
+ kvm_lapic_set_tpr(vcpu, cr8);
+EXPORT_SYMBOL_GPL(set_cr8);
+unsigned long get_cr8(struct kvm_vcpu *vcpu)
+ if (irqchip_in_kernel(vcpu->kvm))
+ return kvm_lapic_get_cr8(vcpu);
+ return vcpu->cr8;
+EXPORT_SYMBOL_GPL(get_cr8);
+ * List of msr numbers which we expose to userspace through KVM_GET_MSRS
12545
+ * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
12547
+ * This list is modified at module load time to reflect the
12548
+ * capabilities of the host cpu.
12550
+static u32 msrs_to_save[] = {
12551
+ MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
12553
+#ifdef CONFIG_X86_64
12554
+ MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
12556
+ MSR_IA32_TIME_STAMP_COUNTER,
12559
+static unsigned num_msrs_to_save;
12561
+static u32 emulated_msrs[] = {
12562
+ MSR_IA32_MISC_ENABLE,
12565
+#ifdef CONFIG_X86_64
12567
+static void set_efer(struct kvm_vcpu *vcpu, u64 efer)
12569
+ if (efer & EFER_RESERVED_BITS) {
12570
+ printk(KERN_DEBUG "set_efer: 0x%llx #GP, reserved bits\n",
12576
+ if (is_paging(vcpu)
12577
+ && (vcpu->shadow_efer & EFER_LME) != (efer & EFER_LME)) {
12578
+ printk(KERN_DEBUG "set_efer: #GP, change LME while paging\n");
12583
+ kvm_x86_ops->set_efer(vcpu, efer);
12585
+ efer &= ~EFER_LMA;
12586
+ efer |= vcpu->shadow_efer & EFER_LMA;
12588
+ vcpu->shadow_efer = efer;
12594
+ * Writes msr value into into the appropriate "register".
12595
+ * Returns 0 on success, non-0 otherwise.
12596
+ * Assumes vcpu_load() was already called.
12598
+int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
12600
+ return kvm_x86_ops->set_msr(vcpu, msr_index, data);
12604
+ * Adapt set_msr() to msr_io()'s calling convention
+static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
+ return kvm_set_msr(vcpu, index, *data);
+int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+#ifdef CONFIG_X86_64
+ set_efer(vcpu, data);
+ case MSR_IA32_MC0_STATUS:
+ pr_unimpl(vcpu, "%s: MSR_IA32_MC0_STATUS 0x%llx, nop\n",
+ __FUNCTION__, data);
+ case MSR_IA32_MCG_STATUS:
+ pr_unimpl(vcpu, "%s: MSR_IA32_MCG_STATUS 0x%llx, nop\n",
+ __FUNCTION__, data);
+ case MSR_IA32_UCODE_REV:
+ case MSR_IA32_UCODE_WRITE:
+ case 0x200 ... 0x2ff: /* MTRRs */
+ case MSR_IA32_APICBASE:
+ kvm_set_apic_base(vcpu, data);
+ case MSR_IA32_MISC_ENABLE:
+ vcpu->ia32_misc_enable_msr = data;
+ pr_unimpl(vcpu, "unhandled wrmsr: 0x%x\n", msr);
+EXPORT_SYMBOL_GPL(kvm_set_msr_common);
+ * Reads an msr value (of 'msr_index') into 'pdata'.
+ * Returns 0 on success, non-0 otherwise.
+ * Assumes vcpu_load() was already called.
+int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
+ return kvm_x86_ops->get_msr(vcpu, msr_index, pdata);
+int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
+ case 0xc0010010: /* SYSCFG */
+ case 0xc0010015: /* HWCR */
+ case MSR_IA32_PLATFORM_ID:
+ case MSR_IA32_P5_MC_ADDR:
+ case MSR_IA32_P5_MC_TYPE:
+ case MSR_IA32_MC0_CTL:
+ case MSR_IA32_MCG_STATUS:
+ case MSR_IA32_MCG_CAP:
+ case MSR_IA32_MC0_MISC:
+ case MSR_IA32_MC0_MISC+4:
+ case MSR_IA32_MC0_MISC+8:
+ case MSR_IA32_MC0_MISC+12:
+ case MSR_IA32_MC0_MISC+16:
+ case MSR_IA32_UCODE_REV:
+ case MSR_IA32_PERF_STATUS:
+ case MSR_IA32_EBL_CR_POWERON:
+ /* MTRR registers */
+ case 0x200 ... 0x2ff:
+ case 0xcd: /* fsb frequency */
+ case MSR_IA32_APICBASE:
+ data = kvm_get_apic_base(vcpu);
+ case MSR_IA32_MISC_ENABLE:
+ data = vcpu->ia32_misc_enable_msr;
+#ifdef CONFIG_X86_64
+ data = vcpu->shadow_efer;
+ pr_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", msr);
+EXPORT_SYMBOL_GPL(kvm_get_msr_common);
+ * Read or write a bunch of msrs. All parameters are kernel addresses.
+ * @return number of msrs set successfully.
+static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
+ struct kvm_msr_entry *entries,
+ int (*do_msr)(struct kvm_vcpu *vcpu,
+ unsigned index, u64 *data))
+ for (i = 0; i < msrs->nmsrs; ++i)
+ if (do_msr(vcpu, entries[i].index, &entries[i].data))
+ * Read or write a bunch of msrs. Parameters are user addresses.
+ * @return number of msrs set successfully.
+static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
+ int (*do_msr)(struct kvm_vcpu *vcpu,
+ unsigned index, u64 *data),
+ struct kvm_msrs msrs;
+ struct kvm_msr_entry *entries;
+ if (copy_from_user(&msrs, user_msrs, sizeof msrs))
+ if (msrs.nmsrs >= MAX_IO_MSRS)
+ size = sizeof(struct kvm_msr_entry) * msrs.nmsrs;
+ entries = vmalloc(size);
+ if (copy_from_user(entries, user_msrs->entries, size))
+ r = n = __msr_io(vcpu, &msrs, entries, do_msr);
+ if (writeback && copy_to_user(user_msrs->entries, entries, size))
+ * Make sure that a cpu that is being hot-unplugged does not have any vcpus
+void decache_vcpus_on_cpu(int cpu)
+ struct kvm_vcpu *vcpu;
+ spin_lock(&kvm_lock);
+ list_for_each_entry(vm, &vm_list, vm_list)
+ for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+ vcpu = vm->vcpus[i];
+ * If the vcpu is locked, then it is running on some
+ * other cpu and therefore it is not cached on the
+ * cpu in question.
+ * If it's not locked, check the last cpu it executed
+ if (mutex_trylock(&vcpu->mutex)) {
+ if (vcpu->cpu == cpu) {
+ kvm_x86_ops->vcpu_decache(vcpu);
+ mutex_unlock(&vcpu->mutex);
+ spin_unlock(&kvm_lock);
+int kvm_dev_ioctl_check_extension(long ext)
+ case KVM_CAP_IRQCHIP:
+ case KVM_CAP_HLT:
+ case KVM_CAP_MMU_SHADOW_CACHE_CONTROL:
+ case KVM_CAP_USER_MEMORY:
+ case KVM_CAP_SET_TSS_ADDR:
+ case KVM_CAP_EXT_CPUID:
+long kvm_arch_dev_ioctl(struct file *filp,
+ unsigned int ioctl, unsigned long arg)
+ void __user *argp = (void __user *)arg;
+ case KVM_GET_MSR_INDEX_LIST: {
+ struct kvm_msr_list __user *user_msr_list = argp;
+ struct kvm_msr_list msr_list;
+ if (copy_from_user(&msr_list, user_msr_list, sizeof msr_list))
+ n = msr_list.nmsrs;
+ msr_list.nmsrs = num_msrs_to_save + ARRAY_SIZE(emulated_msrs);
+ if (copy_to_user(user_msr_list, &msr_list, sizeof msr_list))
+ if (n < num_msrs_to_save)
+ if (copy_to_user(user_msr_list->indices, &msrs_to_save,
+ num_msrs_to_save * sizeof(u32)))
+ if (copy_to_user(user_msr_list->indices
+ + num_msrs_to_save * sizeof(u32),
+ ARRAY_SIZE(emulated_msrs) * sizeof(u32)))
+void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+ kvm_x86_ops->vcpu_load(vcpu, cpu);
+void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
+ kvm_x86_ops->vcpu_put(vcpu);
+ kvm_put_guest_fpu(vcpu);
+static int is_efer_nx(void)
+ rdmsrl(MSR_EFER, efer);
+ return efer & EFER_NX;
+static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
+ struct kvm_cpuid_entry2 *e, *entry;
+ for (i = 0; i < vcpu->cpuid_nent; ++i) {
+ e = &vcpu->cpuid_entries[i];
+ if (e->function == 0x80000001) {
+ if (entry && (entry->edx & (1 << 20)) && !is_efer_nx()) {
+ entry->edx &= ~(1 << 20);
+ printk(KERN_INFO "kvm: guest NX capability removed\n");
+/* when an old userspace process fills a new kernel module */
+static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
+ struct kvm_cpuid *cpuid,
+ struct kvm_cpuid_entry __user *entries)
+ struct kvm_cpuid_entry *cpuid_entries;
+ if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
+ cpuid_entries = vmalloc(sizeof(struct kvm_cpuid_entry) * cpuid->nent);
+ if (!cpuid_entries)
+ if (copy_from_user(cpuid_entries, entries,
+ cpuid->nent * sizeof(struct kvm_cpuid_entry)))
+ for (i = 0; i < cpuid->nent; i++) {
+ vcpu->cpuid_entries[i].function = cpuid_entries[i].function;
+ vcpu->cpuid_entries[i].eax = cpuid_entries[i].eax;
+ vcpu->cpuid_entries[i].ebx = cpuid_entries[i].ebx;
+ vcpu->cpuid_entries[i].ecx = cpuid_entries[i].ecx;
+ vcpu->cpuid_entries[i].edx = cpuid_entries[i].edx;
+ vcpu->cpuid_entries[i].index = 0;
+ vcpu->cpuid_entries[i].flags = 0;
+ vcpu->cpuid_entries[i].padding[0] = 0;
+ vcpu->cpuid_entries[i].padding[1] = 0;
+ vcpu->cpuid_entries[i].padding[2] = 0;
+ vcpu->cpuid_nent = cpuid->nent;
+ cpuid_fix_nx_cap(vcpu);
+ vfree(cpuid_entries);
+static int kvm_vcpu_ioctl_set_cpuid2(struct kvm_vcpu *vcpu,
+ struct kvm_cpuid2 *cpuid,
+ struct kvm_cpuid_entry2 __user *entries)
+ if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
+ if (copy_from_user(&vcpu->cpuid_entries, entries,
+ cpuid->nent * sizeof(struct kvm_cpuid_entry2)))
+ vcpu->cpuid_nent = cpuid->nent;
+static int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu,
+ struct kvm_cpuid2 *cpuid,
+ struct kvm_cpuid_entry2 __user *entries)
+ if (cpuid->nent < vcpu->cpuid_nent)
+ if (copy_to_user(entries, &vcpu->cpuid_entries,
+ vcpu->cpuid_nent * sizeof(struct kvm_cpuid_entry2)))
+ cpuid->nent = vcpu->cpuid_nent;
+static inline u32 bit(int bitno)
+ return 1 << (bitno & 31);
+static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
+ entry->function = function;
+ entry->index = index;
+ cpuid_count(entry->function, entry->index,
+ &entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
+ entry->flags = 0;
+static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
+ u32 index, int *nent, int maxnent)
+ const u32 kvm_supported_word0_x86_features = bit(X86_FEATURE_FPU) |
+ bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
+ bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
+ bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
+ bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
+ bit(X86_FEATURE_SEP) | bit(X86_FEATURE_PGE) |
+ bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
+ bit(X86_FEATURE_CLFLSH) | bit(X86_FEATURE_MMX) |
+ bit(X86_FEATURE_FXSR) | bit(X86_FEATURE_XMM) |
+ bit(X86_FEATURE_XMM2) | bit(X86_FEATURE_SELFSNOOP);
+ const u32 kvm_supported_word1_x86_features = bit(X86_FEATURE_FPU) |
+ bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
+ bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
+ bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
+ bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
+ bit(X86_FEATURE_PGE) |
+ bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
+ bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
+ bit(X86_FEATURE_SYSCALL) |
+ (bit(X86_FEATURE_NX) && is_efer_nx()) |
+#ifdef CONFIG_X86_64
+ bit(X86_FEATURE_LM) |
+ bit(X86_FEATURE_MMXEXT) |
+ bit(X86_FEATURE_3DNOWEXT) |
+ bit(X86_FEATURE_3DNOW);
+ const u32 kvm_supported_word3_x86_features =
+ bit(X86_FEATURE_XMM3) | bit(X86_FEATURE_CX16);
+ const u32 kvm_supported_word6_x86_features =
+ bit(X86_FEATURE_LAHF_LM) | bit(X86_FEATURE_CMP_LEGACY);
+ /* all func 2 cpuid_count() should be called on the same cpu */
+ do_cpuid_1_ent(entry, function, index);
+ switch (function) {
+ entry->eax = min(entry->eax, (u32)0xb);
+ entry->edx &= kvm_supported_word0_x86_features;
+ entry->ecx &= kvm_supported_word3_x86_features;
+ /* function 2 entries are STATEFUL. That is, repeated cpuid commands
+ * may return different values. This forces us to get_cpu() before
+ * issuing the first command, and also to emulate this annoying behavior
+ * in kvm_emulate_cpuid() using KVM_CPUID_FLAG_STATE_READ_NEXT */
+ int t, times = entry->eax & 0xff;
+ entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
+ for (t = 1; t < times && *nent < maxnent; ++t) {
+ do_cpuid_1_ent(&entry[t], function, 0);
+ entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
+ /* function 4 and 0xb have additional index. */
+ int index, cache_type;
+ entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+ /* read more entries until cache_type is zero */
+ for (index = 1; *nent < maxnent; ++index) {
+ cache_type = entry[index - 1].eax & 0x1f;
+ do_cpuid_1_ent(&entry[index], function, index);
+ entry[index].flags |=
+ KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+ int index, level_type;
+ entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+ /* read more entries until level_type is zero */
+ for (index = 1; *nent < maxnent; ++index) {
+ level_type = entry[index - 1].ecx & 0xff;
+ do_cpuid_1_ent(&entry[index], function, index);
+ entry[index].flags |=
+ KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+ entry->eax = min(entry->eax, 0x8000001a);
+ entry->edx &= kvm_supported_word1_x86_features;
+ entry->ecx &= kvm_supported_word6_x86_features;
+static int kvm_vm_ioctl_get_supported_cpuid(struct kvm *kvm,
+ struct kvm_cpuid2 *cpuid,
+ struct kvm_cpuid_entry2 __user *entries)
+ struct kvm_cpuid_entry2 *cpuid_entries;
+ int limit, nent = 0, r = -E2BIG;
+ if (cpuid->nent < 1)
+ cpuid_entries = vmalloc(sizeof(struct kvm_cpuid_entry2) * cpuid->nent);
+ if (!cpuid_entries)
+ do_cpuid_ent(&cpuid_entries[0], 0, 0, &nent, cpuid->nent);
+ limit = cpuid_entries[0].eax;
+ for (func = 1; func <= limit && nent < cpuid->nent; ++func)
+ do_cpuid_ent(&cpuid_entries[nent], func, 0,
+ &nent, cpuid->nent);
+ if (nent >= cpuid->nent)
+ do_cpuid_ent(&cpuid_entries[nent], 0x80000000, 0, &nent, cpuid->nent);
+ limit = cpuid_entries[nent - 1].eax;
+ for (func = 0x80000001; func <= limit && nent < cpuid->nent; ++func)
+ do_cpuid_ent(&cpuid_entries[nent], func, 0,
+ &nent, cpuid->nent);
+ if (copy_to_user(entries, cpuid_entries,
+ nent * sizeof(struct kvm_cpuid_entry2)))
+ cpuid->nent = nent;
+ vfree(cpuid_entries);
+static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
+ struct kvm_lapic_state *s)
+ memcpy(s->regs, vcpu->apic->regs, sizeof *s);
+static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu,
+ struct kvm_lapic_state *s)
+ memcpy(vcpu->apic->regs, s->regs, sizeof *s);
+ kvm_apic_post_state_restore(vcpu);
+static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
+ struct kvm_interrupt *irq)
+ if (irq->irq < 0 || irq->irq >= 256)
+ if (irqchip_in_kernel(vcpu->kvm))
+ set_bit(irq->irq, vcpu->irq_pending);
+ set_bit(irq->irq / BITS_PER_LONG, &vcpu->irq_summary);
+long kvm_arch_vcpu_ioctl(struct file *filp,
+ unsigned int ioctl, unsigned long arg)
+ struct kvm_vcpu *vcpu = filp->private_data;
+ void __user *argp = (void __user *)arg;
+ case KVM_GET_LAPIC: {
+ struct kvm_lapic_state lapic;
+ memset(&lapic, 0, sizeof lapic);
+ r = kvm_vcpu_ioctl_get_lapic(vcpu, &lapic);
+ if (copy_to_user(argp, &lapic, sizeof lapic))
+ case KVM_SET_LAPIC: {
+ struct kvm_lapic_state lapic;
+ if (copy_from_user(&lapic, argp, sizeof lapic))
+ r = kvm_vcpu_ioctl_set_lapic(vcpu, &lapic);
+ case KVM_INTERRUPT: {
+ struct kvm_interrupt irq;
+ if (copy_from_user(&irq, argp, sizeof irq))
+ r = kvm_vcpu_ioctl_interrupt(vcpu, &irq);
+ case KVM_SET_CPUID: {
+ struct kvm_cpuid __user *cpuid_arg = argp;
+ struct kvm_cpuid cpuid;
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
+ case KVM_SET_CPUID2: {
+ struct kvm_cpuid2 __user *cpuid_arg = argp;
+ struct kvm_cpuid2 cpuid;
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ r = kvm_vcpu_ioctl_set_cpuid2(vcpu, &cpuid,
+ cpuid_arg->entries);
+ case KVM_GET_CPUID2: {
+ struct kvm_cpuid2 __user *cpuid_arg = argp;
+ struct kvm_cpuid2 cpuid;
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ r = kvm_vcpu_ioctl_get_cpuid2(vcpu, &cpuid,
+ cpuid_arg->entries);
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof cpuid))
+ case KVM_GET_MSRS:
+ r = msr_io(vcpu, argp, kvm_get_msr, 1);
+ case KVM_SET_MSRS:
+ r = msr_io(vcpu, argp, do_set_msr, 0);
+static int kvm_vm_ioctl_set_tss_addr(struct kvm *kvm, unsigned long addr)
+ if (addr > (unsigned int)(-3 * PAGE_SIZE))
+ ret = kvm_x86_ops->set_tss_addr(kvm, addr);
+static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm *kvm,
+ u32 kvm_nr_mmu_pages)
+ if (kvm_nr_mmu_pages < KVM_MIN_ALLOC_MMU_PAGES)
+ mutex_lock(&kvm->lock);
+ kvm_mmu_change_mmu_pages(kvm, kvm_nr_mmu_pages);
+ kvm->n_requested_mmu_pages = kvm_nr_mmu_pages;
+ mutex_unlock(&kvm->lock);
+static int kvm_vm_ioctl_get_nr_mmu_pages(struct kvm *kvm)
+ return kvm->n_alloc_mmu_pages;
+gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
+ struct kvm_mem_alias *alias;
+ for (i = 0; i < kvm->naliases; ++i) {
+ alias = &kvm->aliases[i];
+ if (gfn >= alias->base_gfn
+ && gfn < alias->base_gfn + alias->npages)
+ return alias->target_gfn + gfn - alias->base_gfn;
+ * Set a new alias region. Aliases map a portion of physical memory into
+ * another portion. This is useful for memory windows, for example the PC
+static int kvm_vm_ioctl_set_memory_alias(struct kvm *kvm,
+ struct kvm_memory_alias *alias)
+ struct kvm_mem_alias *p;
+ /* General sanity checks */
+ if (alias->memory_size & (PAGE_SIZE - 1))
+ if (alias->guest_phys_addr & (PAGE_SIZE - 1))
+ if (alias->slot >= KVM_ALIAS_SLOTS)
+ if (alias->guest_phys_addr + alias->memory_size
+ < alias->guest_phys_addr)
+ if (alias->target_phys_addr + alias->memory_size
+ < alias->target_phys_addr)
+ mutex_lock(&kvm->lock);
+ p = &kvm->aliases[alias->slot];
+ p->base_gfn = alias->guest_phys_addr >> PAGE_SHIFT;
+ p->npages = alias->memory_size >> PAGE_SHIFT;
+ p->target_gfn = alias->target_phys_addr >> PAGE_SHIFT;
+ for (n = KVM_ALIAS_SLOTS; n > 0; --n)
+ if (kvm->aliases[n - 1].npages)
+ kvm->naliases = n;
+ kvm_mmu_zap_all(kvm);
+ mutex_unlock(&kvm->lock);
+static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
+ switch (chip->chip_id) {
+ case KVM_IRQCHIP_PIC_MASTER:
+ memcpy(&chip->chip.pic,
+ &pic_irqchip(kvm)->pics[0],
+ sizeof(struct kvm_pic_state));
+ case KVM_IRQCHIP_PIC_SLAVE:
+ memcpy(&chip->chip.pic,
+ &pic_irqchip(kvm)->pics[1],
+ sizeof(struct kvm_pic_state));
+ case KVM_IRQCHIP_IOAPIC:
+ memcpy(&chip->chip.ioapic,
+ ioapic_irqchip(kvm),
+ sizeof(struct kvm_ioapic_state));
+static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
+ switch (chip->chip_id) {
+ case KVM_IRQCHIP_PIC_MASTER:
+ memcpy(&pic_irqchip(kvm)->pics[0],
+ sizeof(struct kvm_pic_state));
+ case KVM_IRQCHIP_PIC_SLAVE:
+ memcpy(&pic_irqchip(kvm)->pics[1],
+ sizeof(struct kvm_pic_state));
+ case KVM_IRQCHIP_IOAPIC:
+ memcpy(ioapic_irqchip(kvm),
+ &chip->chip.ioapic,
+ sizeof(struct kvm_ioapic_state));
+ kvm_pic_update_irq(pic_irqchip(kvm));
+ * Get (and clear) the dirty memory log for a memory slot.
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
+ struct kvm_dirty_log *log)
+ struct kvm_memory_slot *memslot;
+ int is_dirty = 0;
+ mutex_lock(&kvm->lock);
+ r = kvm_get_dirty_log(kvm, log, &is_dirty);
+ /* If nothing is dirty, don't bother messing with page tables. */
+ kvm_mmu_slot_remove_write_access(kvm, log->slot);
+ kvm_flush_remote_tlbs(kvm);
+ memslot = &kvm->memslots[log->slot];
+ n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
+ memset(memslot->dirty_bitmap, 0, n);
+ mutex_unlock(&kvm->lock);
+long kvm_arch_vm_ioctl(struct file *filp,
+ unsigned int ioctl, unsigned long arg)
+ struct kvm *kvm = filp->private_data;
+ void __user *argp = (void __user *)arg;
+ case KVM_SET_TSS_ADDR:
+ r = kvm_vm_ioctl_set_tss_addr(kvm, arg);
+ case KVM_SET_MEMORY_REGION: {
+ struct kvm_memory_region kvm_mem;
+ struct kvm_userspace_memory_region kvm_userspace_mem;
+ if (copy_from_user(&kvm_mem, argp, sizeof kvm_mem))
+ kvm_userspace_mem.slot = kvm_mem.slot;
+ kvm_userspace_mem.flags = kvm_mem.flags;
+ kvm_userspace_mem.guest_phys_addr = kvm_mem.guest_phys_addr;
+ kvm_userspace_mem.memory_size = kvm_mem.memory_size;
+ r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem, 0);
+ case KVM_SET_NR_MMU_PAGES:
+ r = kvm_vm_ioctl_set_nr_mmu_pages(kvm, arg);
+ case KVM_GET_NR_MMU_PAGES:
+ r = kvm_vm_ioctl_get_nr_mmu_pages(kvm);
+ case KVM_SET_MEMORY_ALIAS: {
+ struct kvm_memory_alias alias;
+ if (copy_from_user(&alias, argp, sizeof alias))
+ r = kvm_vm_ioctl_set_memory_alias(kvm, &alias);
+ case KVM_CREATE_IRQCHIP:
+ kvm->vpic = kvm_create_pic(kvm);
+ r = kvm_ioapic_init(kvm);
+ kfree(kvm->vpic);
+ kvm->vpic = NULL;
+ case KVM_IRQ_LINE: {
+ struct kvm_irq_level irq_event;
+ if (copy_from_user(&irq_event, argp, sizeof irq_event))
+ if (irqchip_in_kernel(kvm)) {
+ mutex_lock(&kvm->lock);
+ if (irq_event.irq < 16)
+ kvm_pic_set_irq(pic_irqchip(kvm),
+ irq_event.level);
+ kvm_ioapic_set_irq(kvm->vioapic,
+ irq_event.level);
+ mutex_unlock(&kvm->lock);
+ case KVM_GET_IRQCHIP: {
+ /* 0: PIC master, 1: PIC slave, 2: IOAPIC */
+ struct kvm_irqchip chip;
+ if (copy_from_user(&chip, argp, sizeof chip))
+ if (!irqchip_in_kernel(kvm))
+ r = kvm_vm_ioctl_get_irqchip(kvm, &chip);
+ if (copy_to_user(argp, &chip, sizeof chip))
+ case KVM_SET_IRQCHIP: {
+ /* 0: PIC master, 1: PIC slave, 2: IOAPIC */
+ struct kvm_irqchip chip;
+ if (copy_from_user(&chip, argp, sizeof chip))
+ if (!irqchip_in_kernel(kvm))
+ r = kvm_vm_ioctl_set_irqchip(kvm, &chip);
+ case KVM_GET_SUPPORTED_CPUID: {
+ struct kvm_cpuid2 __user *cpuid_arg = argp;
+ struct kvm_cpuid2 cpuid;
+ if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+ r = kvm_vm_ioctl_get_supported_cpuid(kvm, &cpuid,
+ cpuid_arg->entries);
+ if (copy_to_user(cpuid_arg, &cpuid, sizeof cpuid))
+static void kvm_init_msr_list(void)
+ for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
+ if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
+ msrs_to_save[j] = msrs_to_save[i];
+ num_msrs_to_save = j;
+ * Only apic need an MMIO device hook, so shortcut now..
+static struct kvm_io_device *vcpu_find_pervcpu_dev(struct kvm_vcpu *vcpu,
+ struct kvm_io_device *dev;
+ if (vcpu->apic) {
+ dev = &vcpu->apic->dev;
+ if (dev->in_range(dev, addr))
+static struct kvm_io_device *vcpu_find_mmio_dev(struct kvm_vcpu *vcpu,
+ struct kvm_io_device *dev;
+ dev = vcpu_find_pervcpu_dev(vcpu, addr);
+ dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr);
+int emulator_read_std(unsigned long addr,
+ unsigned int bytes,
+ struct kvm_vcpu *vcpu)
+ void *data = val;
+ gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
+ unsigned offset = addr & (PAGE_SIZE-1);
+ unsigned tocopy = min(bytes, (unsigned)PAGE_SIZE - offset);
+ if (gpa == UNMAPPED_GVA)
+ return X86EMUL_PROPAGATE_FAULT;
+ ret = kvm_read_guest(vcpu->kvm, gpa, data, tocopy);
+ return X86EMUL_UNHANDLEABLE;
+ return X86EMUL_CONTINUE;
+EXPORT_SYMBOL_GPL(emulator_read_std);
+static int emulator_read_emulated(unsigned long addr,
+ unsigned int bytes,
+ struct kvm_vcpu *vcpu)
+ struct kvm_io_device *mmio_dev;
+ if (vcpu->mmio_read_completed) {
+ memcpy(val, vcpu->mmio_data, bytes);
+ vcpu->mmio_read_completed = 0;
+ return X86EMUL_CONTINUE;
+ gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
+ /* For APIC access vmexit */
+ if ((gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE)
+ if (emulator_read_std(addr, val, bytes, vcpu)
+ == X86EMUL_CONTINUE)
+ return X86EMUL_CONTINUE;
+ if (gpa == UNMAPPED_GVA)
+ return X86EMUL_PROPAGATE_FAULT;
+ * Is this MMIO handled locally?
+ mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
+ kvm_iodevice_read(mmio_dev, gpa, bytes, val);
+ return X86EMUL_CONTINUE;
+ vcpu->mmio_needed = 1;
+ vcpu->mmio_phys_addr = gpa;
+ vcpu->mmio_size = bytes;
+ vcpu->mmio_is_write = 0;
+ return X86EMUL_UNHANDLEABLE;
+static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
+ const void *val, int bytes)
+ ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes);
+ kvm_mmu_pte_write(vcpu, gpa, val, bytes);
+static int emulator_write_emulated_onepage(unsigned long addr,
+ unsigned int bytes,
+ struct kvm_vcpu *vcpu)
+ struct kvm_io_device *mmio_dev;
+ gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
+ if (gpa == UNMAPPED_GVA) {
+ kvm_x86_ops->inject_page_fault(vcpu, addr, 2);
+ return X86EMUL_PROPAGATE_FAULT;
+ /* For APIC access vmexit */
+ if ((gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE)
+ if (emulator_write_phys(vcpu, gpa, val, bytes))
+ return X86EMUL_CONTINUE;
+ * Is this MMIO handled locally?
+ mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
+ kvm_iodevice_write(mmio_dev, gpa, bytes, val);
+ return X86EMUL_CONTINUE;
+ vcpu->mmio_needed = 1;
+ vcpu->mmio_phys_addr = gpa;
+ vcpu->mmio_size = bytes;
+ vcpu->mmio_is_write = 1;
+ memcpy(vcpu->mmio_data, val, bytes);
+ return X86EMUL_CONTINUE;
+int emulator_write_emulated(unsigned long addr,
+ unsigned int bytes,
+ struct kvm_vcpu *vcpu)
+ /* Crossing a page boundary? */
+ if (((addr + bytes - 1) ^ addr) & PAGE_MASK) {
+ now = -addr & ~PAGE_MASK;
+ rc = emulator_write_emulated_onepage(addr, val, now, vcpu);
+ if (rc != X86EMUL_CONTINUE)
+ return emulator_write_emulated_onepage(addr, val, bytes, vcpu);
+EXPORT_SYMBOL_GPL(emulator_write_emulated);
+static int emulator_cmpxchg_emulated(unsigned long addr,
+ unsigned int bytes,
+ struct kvm_vcpu *vcpu)
+ static int reported;
+ printk(KERN_WARNING "kvm: emulating exchange as write\n");
+ return emulator_write_emulated(addr, new, bytes, vcpu);
+static unsigned long get_segment_base(struct kvm_vcpu *vcpu, int seg)
+ return kvm_x86_ops->get_segment_base(vcpu, seg);
+int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address)
+ return X86EMUL_CONTINUE;
+int emulate_clts(struct kvm_vcpu *vcpu)
+ kvm_x86_ops->set_cr0(vcpu, vcpu->cr0 & ~X86_CR0_TS);
+ return X86EMUL_CONTINUE;
+int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long *dest)
+ struct kvm_vcpu *vcpu = ctxt->vcpu;
+ *dest = kvm_x86_ops->get_dr(vcpu, dr);
+ return X86EMUL_CONTINUE;
+ pr_unimpl(vcpu, "%s: unexpected dr %u\n", __FUNCTION__, dr);
+ return X86EMUL_UNHANDLEABLE;
+int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long value)
+ unsigned long mask = (ctxt->mode == X86EMUL_MODE_PROT64) ? ~0ULL : ~0U;
+ kvm_x86_ops->set_dr(ctxt->vcpu, dr, value & mask, &exception);
+ /* FIXME: better handling */
+ return X86EMUL_UNHANDLEABLE;
+ return X86EMUL_CONTINUE;
+void kvm_report_emulation_failure(struct kvm_vcpu *vcpu, const char *context)
+ static int reported;
+ unsigned long rip = vcpu->rip;
+ unsigned long rip_linear;
+ rip_linear = rip + get_segment_base(vcpu, VCPU_SREG_CS);
+ emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu);
+ printk(KERN_ERR "emulation failed (%s) rip %lx %02x %02x %02x %02x\n",
+ context, rip, opcodes[0], opcodes[1], opcodes[2], opcodes[3]);
+EXPORT_SYMBOL_GPL(kvm_report_emulation_failure);
+struct x86_emulate_ops emulate_ops = {
+ .read_std = emulator_read_std,
+ .read_emulated = emulator_read_emulated,
+ .write_emulated = emulator_write_emulated,
+ .cmpxchg_emulated = emulator_cmpxchg_emulated,
+int emulate_instruction(struct kvm_vcpu *vcpu,
+ struct kvm_run *run,
+ unsigned long cr2,
+ vcpu->mmio_fault_cr2 = cr2;
+ kvm_x86_ops->cache_regs(vcpu);
+ vcpu->mmio_is_write = 0;
+ vcpu->pio.string = 0;
+ if (!no_decode) {
+ kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+ vcpu->emulate_ctxt.vcpu = vcpu;
+ vcpu->emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
+ vcpu->emulate_ctxt.mode =
+ (vcpu->emulate_ctxt.eflags & X86_EFLAGS_VM)
+ ? X86EMUL_MODE_REAL : cs_l
+ ? X86EMUL_MODE_PROT64 : cs_db
+ ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
+ if (vcpu->emulate_ctxt.mode == X86EMUL_MODE_PROT64) {
+ vcpu->emulate_ctxt.cs_base = 0;
+ vcpu->emulate_ctxt.ds_base = 0;
+ vcpu->emulate_ctxt.es_base = 0;
+ vcpu->emulate_ctxt.ss_base = 0;
+ vcpu->emulate_ctxt.cs_base =
+ get_segment_base(vcpu, VCPU_SREG_CS);
+ vcpu->emulate_ctxt.ds_base =
+ get_segment_base(vcpu, VCPU_SREG_DS);
+ vcpu->emulate_ctxt.es_base =
+ get_segment_base(vcpu, VCPU_SREG_ES);
+ vcpu->emulate_ctxt.ss_base =
+ get_segment_base(vcpu, VCPU_SREG_SS);
+ vcpu->emulate_ctxt.gs_base =
+ get_segment_base(vcpu, VCPU_SREG_GS);
+ vcpu->emulate_ctxt.fs_base =
+ get_segment_base(vcpu, VCPU_SREG_FS);
+ r = x86_decode_insn(&vcpu->emulate_ctxt, &emulate_ops);
+ ++vcpu->stat.insn_emulation;
+ ++vcpu->stat.insn_emulation_fail;
+ if (kvm_mmu_unprotect_page_virt(vcpu, cr2))
+ return EMULATE_DONE;
+ return EMULATE_FAIL;
+ r = x86_emulate_insn(&vcpu->emulate_ctxt, &emulate_ops);
+ if (vcpu->pio.string)
+ return EMULATE_DO_MMIO;
+ if ((r || vcpu->mmio_is_write) && run) {
+ run->exit_reason = KVM_EXIT_MMIO;
+ run->mmio.phys_addr = vcpu->mmio_phys_addr;
+ memcpy(run->mmio.data, vcpu->mmio_data, 8);
+ run->mmio.len = vcpu->mmio_size;
+ run->mmio.is_write = vcpu->mmio_is_write;
+ if (kvm_mmu_unprotect_page_virt(vcpu, cr2))
+ return EMULATE_DONE;
+ if (!vcpu->mmio_needed) {
+ kvm_report_emulation_failure(vcpu, "mmio");
+ return EMULATE_FAIL;
+ return EMULATE_DO_MMIO;
+ kvm_x86_ops->decache_regs(vcpu);
+ kvm_x86_ops->set_rflags(vcpu, vcpu->emulate_ctxt.eflags);
+ if (vcpu->mmio_is_write) {
+ vcpu->mmio_needed = 0;
+ return EMULATE_DO_MMIO;
+ return EMULATE_DONE;
+EXPORT_SYMBOL_GPL(emulate_instruction);
+static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
+ for (i = 0; i < ARRAY_SIZE(vcpu->pio.guest_pages); ++i)
+ if (vcpu->pio.guest_pages[i]) {
+ kvm_release_page_dirty(vcpu->pio.guest_pages[i]);
+ vcpu->pio.guest_pages[i] = NULL;
+static int pio_copy_data(struct kvm_vcpu *vcpu)
+ void *p = vcpu->pio_data;
+ int nr_pages = vcpu->pio.guest_pages[1] ? 2 : 1;
+ q = vmap(vcpu->pio.guest_pages, nr_pages, VM_READ|VM_WRITE,
+ free_pio_guest_pages(vcpu);
+ q += vcpu->pio.guest_page_offset;
+ bytes = vcpu->pio.size * vcpu->pio.cur_count;
+ if (vcpu->pio.in)
+ memcpy(q, p, bytes);
+ memcpy(p, q, bytes);
+ q -= vcpu->pio.guest_page_offset;
+ free_pio_guest_pages(vcpu);
+int complete_pio(struct kvm_vcpu *vcpu)
+ struct kvm_pio_request *io = &vcpu->pio;
+ kvm_x86_ops->cache_regs(vcpu);
+ if (!io->string) {
+ memcpy(&vcpu->regs[VCPU_REGS_RAX], vcpu->pio_data,
+ r = pio_copy_data(vcpu);
+ kvm_x86_ops->cache_regs(vcpu);
+ delta *= io->cur_count;
+ * The size of the register should really depend on
+ * current address size.
+ vcpu->regs[VCPU_REGS_RCX] -= delta;
+ delta *= io->size;
+ vcpu->regs[VCPU_REGS_RDI] += delta;
+ vcpu->regs[VCPU_REGS_RSI] += delta;
+ kvm_x86_ops->decache_regs(vcpu);
+ io->count -= io->cur_count;
+ io->cur_count = 0;
+static void kernel_pio(struct kvm_io_device *pio_dev,
+ struct kvm_vcpu *vcpu,
+ /* TODO: String I/O for in kernel device */
+ mutex_lock(&vcpu->kvm->lock);
+ if (vcpu->pio.in)
+ kvm_iodevice_read(pio_dev, vcpu->pio.port,
+ kvm_iodevice_write(pio_dev, vcpu->pio.port,
+ mutex_unlock(&vcpu->kvm->lock);
+static void pio_string_write(struct kvm_io_device *pio_dev,
+ struct kvm_vcpu *vcpu)
+ struct kvm_pio_request *io = &vcpu->pio;
+ void *pd = vcpu->pio_data;
+ mutex_lock(&vcpu->kvm->lock);
+ for (i = 0; i < io->cur_count; i++) {
+ kvm_iodevice_write(pio_dev, io->port,
+ mutex_unlock(&vcpu->kvm->lock);
+static struct kvm_io_device *vcpu_find_pio_dev(struct kvm_vcpu *vcpu,
+ return kvm_io_bus_find_dev(&vcpu->kvm->pio_bus, addr);
+int kvm_emulate_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+ int size, unsigned port)
+ struct kvm_io_device *pio_dev;
+ vcpu->run->exit_reason = KVM_EXIT_IO;
+ vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
+ vcpu->run->io.size = vcpu->pio.size = size;
+ vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
+ vcpu->run->io.count = vcpu->pio.count = vcpu->pio.cur_count = 1;
+ vcpu->run->io.port = vcpu->pio.port = port;
+ vcpu->pio.in = in;
+ vcpu->pio.string = 0;
+ vcpu->pio.down = 0;
+ vcpu->pio.guest_page_offset = 0;
+ vcpu->pio.rep = 0;
+ kvm_x86_ops->cache_regs(vcpu);
+ memcpy(vcpu->pio_data, &vcpu->regs[VCPU_REGS_RAX], 4);
+ kvm_x86_ops->decache_regs(vcpu);
+ kvm_x86_ops->skip_emulated_instruction(vcpu);
+ pio_dev = vcpu_find_pio_dev(vcpu, port);
+ kernel_pio(pio_dev, vcpu, vcpu->pio_data);
+ complete_pio(vcpu);
+EXPORT_SYMBOL_GPL(kvm_emulate_pio);
+int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
14138
+ int size, unsigned long count, int down,
14139
+ gva_t address, int rep, unsigned port)
14141
+ unsigned now, in_page;
14143
+ int nr_pages = 1;
14144
+ struct page *page;
14145
+ struct kvm_io_device *pio_dev;
14147
+ vcpu->run->exit_reason = KVM_EXIT_IO;
14148
+ vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
14149
+ vcpu->run->io.size = vcpu->pio.size = size;
14150
+ vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
14151
+ vcpu->run->io.count = vcpu->pio.count = vcpu->pio.cur_count = count;
14152
+ vcpu->run->io.port = vcpu->pio.port = port;
14153
+ vcpu->pio.in = in;
14154
+ vcpu->pio.string = 1;
14155
+ vcpu->pio.down = down;
14156
+ vcpu->pio.guest_page_offset = offset_in_page(address);
14157
+ vcpu->pio.rep = rep;
14160
+ kvm_x86_ops->skip_emulated_instruction(vcpu);
14165
+ in_page = PAGE_SIZE - offset_in_page(address);
14167
+ in_page = offset_in_page(address) + size;
14168
+ now = min(count, (unsigned long)in_page / size);
14171
+ * String I/O straddles page boundary. Pin two guest pages
14172
+ * so that we satisfy atomicity constraints. Do just one
14173
+ * transaction to avoid complexity.
14180
+ * String I/O in reverse. Yuck. Kill the guest, fix later.
14182
+ pr_unimpl(vcpu, "guest string pio down\n");
14186
+ vcpu->run->io.count = now;
14187
+ vcpu->pio.cur_count = now;
14189
+ if (vcpu->pio.cur_count == vcpu->pio.count)
14190
+ kvm_x86_ops->skip_emulated_instruction(vcpu);
14192
+ for (i = 0; i < nr_pages; ++i) {
14193
+ mutex_lock(&vcpu->kvm->lock);
14194
+ page = gva_to_page(vcpu, address + i * PAGE_SIZE);
14195
+ vcpu->pio.guest_pages[i] = page;
14196
+ mutex_unlock(&vcpu->kvm->lock);
14199
+ free_pio_guest_pages(vcpu);
14204
+ pio_dev = vcpu_find_pio_dev(vcpu, port);
14205
+ if (!vcpu->pio.in) {
14206
+ /* string PIO write */
14207
+ ret = pio_copy_data(vcpu);
14208
+ if (ret >= 0 && pio_dev) {
14209
+ pio_string_write(pio_dev, vcpu);
14210
+ complete_pio(vcpu);
14211
+ if (vcpu->pio.count == 0)
14214
+ } else if (pio_dev)
14215
+ pr_unimpl(vcpu, "no string pio read support yet, "
14216
+ "port %x size %d count %ld\n",
14217
+ port, size, count);
14221
+EXPORT_SYMBOL_GPL(kvm_emulate_pio_string);
+int kvm_arch_init(void *opaque)
+ struct kvm_x86_ops *ops = (struct kvm_x86_ops *)opaque;
+ r = kvm_mmu_module_init();
+ kvm_init_msr_list();
+ if (kvm_x86_ops) {
+ printk(KERN_ERR "kvm: already loaded the other module\n");
+ if (!ops->cpu_has_kvm_support()) {
+ printk(KERN_ERR "kvm: no hardware support\n");
+ if (ops->disabled_by_bios()) {
+ printk(KERN_ERR "kvm: disabled by bios\n");
+ kvm_x86_ops = ops;
+ kvm_mmu_set_nonpresent_ptes(0ull, 0ull);
+ kvm_mmu_module_exit();
+void kvm_arch_exit(void)
+ kvm_x86_ops = NULL;
+ kvm_mmu_module_exit();
+int kvm_emulate_halt(struct kvm_vcpu *vcpu)
+ ++vcpu->stat.halt_exits;
+ if (irqchip_in_kernel(vcpu->kvm)) {
+ vcpu->mp_state = VCPU_MP_STATE_HALTED;
+ kvm_vcpu_block(vcpu);
+ if (vcpu->mp_state != VCPU_MP_STATE_RUNNABLE)
+ vcpu->run->exit_reason = KVM_EXIT_HLT;
+EXPORT_SYMBOL_GPL(kvm_emulate_halt);
+int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
+ unsigned long nr, a0, a1, a2, a3, ret;
+ kvm_x86_ops->cache_regs(vcpu);
+ nr = vcpu->regs[VCPU_REGS_RAX];
+ a0 = vcpu->regs[VCPU_REGS_RBX];
+ a1 = vcpu->regs[VCPU_REGS_RCX];
+ a2 = vcpu->regs[VCPU_REGS_RDX];
+ a3 = vcpu->regs[VCPU_REGS_RSI];
+ if (!is_long_mode(vcpu)) {
+ nr &= 0xFFFFFFFF;
+ a0 &= 0xFFFFFFFF;
+ a1 &= 0xFFFFFFFF;
+ a2 &= 0xFFFFFFFF;
+ a3 &= 0xFFFFFFFF;
+ ret = -KVM_ENOSYS;
+ vcpu->regs[VCPU_REGS_RAX] = ret;
+ kvm_x86_ops->decache_regs(vcpu);
+EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
+int kvm_fix_hypercall(struct kvm_vcpu *vcpu)
+ char instruction[3];
+ mutex_lock(&vcpu->kvm->lock);
+ * Blow out the MMU to ensure that no other VCPU has an active mapping
+ * to ensure that the updated hypercall appears atomically across all
+ kvm_mmu_zap_all(vcpu->kvm);
+ kvm_x86_ops->cache_regs(vcpu);
+ kvm_x86_ops->patch_hypercall(vcpu, instruction);
+ if (emulator_write_emulated(vcpu->rip, instruction, 3, vcpu)
+ != X86EMUL_CONTINUE)
+ mutex_unlock(&vcpu->kvm->lock);
+static u64 mk_cr_64(u64 curr_cr, u32 new_val)
+ return (curr_cr & ~((1ULL << 32) - 1)) | new_val;
+void realmode_lgdt(struct kvm_vcpu *vcpu, u16 limit, unsigned long base)
+ struct descriptor_table dt = { limit, base };
+ kvm_x86_ops->set_gdt(vcpu, &dt);
+void realmode_lidt(struct kvm_vcpu *vcpu, u16 limit, unsigned long base)
+ struct descriptor_table dt = { limit, base };
+ kvm_x86_ops->set_idt(vcpu, &dt);
+void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
+ unsigned long *rflags)
+ *rflags = kvm_x86_ops->get_rflags(vcpu);
+unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr)
+ kvm_x86_ops->decache_cr4_guest_bits(vcpu);
+ return vcpu->cr0;
+ return vcpu->cr2;
+ return vcpu->cr3;
+ return vcpu->cr4;
+ vcpu_printf(vcpu, "%s: unexpected cr %u\n", __FUNCTION__, cr);
+void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long val,
+ unsigned long *rflags)
+ set_cr0(vcpu, mk_cr_64(vcpu->cr0, val));
+ *rflags = kvm_x86_ops->get_rflags(vcpu);
+ set_cr3(vcpu, val);
+ set_cr4(vcpu, mk_cr_64(vcpu->cr4, val));
+ vcpu_printf(vcpu, "%s: unexpected cr %u\n", __FUNCTION__, cr);
+static int move_to_next_stateful_cpuid_entry(struct kvm_vcpu *vcpu, int i)
+ struct kvm_cpuid_entry2 *e = &vcpu->cpuid_entries[i];
+ int j, nent = vcpu->cpuid_nent;
+ e->flags &= ~KVM_CPUID_FLAG_STATE_READ_NEXT;
+ /* when no next entry is found, the current entry[i] is reselected */
+ for (j = i + 1; j != i; j = (j + 1) % nent) {
+ struct kvm_cpuid_entry2 *ej = &vcpu->cpuid_entries[j];
+ if (ej->function == e->function) {
+ ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
+ return 0; /* silence gcc, even though control never reaches here */
+/* find an entry with matching function, matching index (if needed), and that
+ * should be read next (if it's stateful) */
+static int is_matching_cpuid_entry(struct kvm_cpuid_entry2 *e,
+ u32 function, u32 index)
+ if (e->function != function)
+ if ((e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) && e->index != index)
+ if ((e->flags & KVM_CPUID_FLAG_STATEFUL_FUNC) &&
+ !(e->flags & KVM_CPUID_FLAG_STATE_READ_NEXT))
+void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
+ u32 function, index;
+ struct kvm_cpuid_entry2 *e, *best;
+ kvm_x86_ops->cache_regs(vcpu);
+ function = vcpu->regs[VCPU_REGS_RAX];
+ index = vcpu->regs[VCPU_REGS_RCX];
+ vcpu->regs[VCPU_REGS_RAX] = 0;
+ vcpu->regs[VCPU_REGS_RBX] = 0;
+ vcpu->regs[VCPU_REGS_RCX] = 0;
+ vcpu->regs[VCPU_REGS_RDX] = 0;
+ for (i = 0; i < vcpu->cpuid_nent; ++i) {
+ e = &vcpu->cpuid_entries[i];
+ if (is_matching_cpuid_entry(e, function, index)) {
+ if (e->flags & KVM_CPUID_FLAG_STATEFUL_FUNC)
+ move_to_next_stateful_cpuid_entry(vcpu, i);
+ * Both basic or both extended?
+ if (((e->function ^ function) & 0x80000000) == 0)
+ if (!best || e->function > best->function)
+ vcpu->regs[VCPU_REGS_RAX] = best->eax;
+ vcpu->regs[VCPU_REGS_RBX] = best->ebx;
+ vcpu->regs[VCPU_REGS_RCX] = best->ecx;
+ vcpu->regs[VCPU_REGS_RDX] = best->edx;
+ kvm_x86_ops->decache_regs(vcpu);
+ kvm_x86_ops->skip_emulated_instruction(vcpu);
+EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
+ * Check if userspace requested an interrupt window, and that the
+ * interrupt window is open.
+ * No need to exit to userspace if we already have an interrupt queued.
+static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu,
+ struct kvm_run *kvm_run)
+ return (!vcpu->irq_summary &&
+ kvm_run->request_interrupt_window &&
+ vcpu->interrupt_window_open &&
+ (kvm_x86_ops->get_rflags(vcpu) & X86_EFLAGS_IF));
+static void post_kvm_run_save(struct kvm_vcpu *vcpu,
+ struct kvm_run *kvm_run)
+ kvm_run->if_flag = (kvm_x86_ops->get_rflags(vcpu) & X86_EFLAGS_IF) != 0;
+ kvm_run->cr8 = get_cr8(vcpu);
+ kvm_run->apic_base = kvm_get_apic_base(vcpu);
+ if (irqchip_in_kernel(vcpu->kvm))
+ kvm_run->ready_for_interrupt_injection = 1;
+ kvm_run->ready_for_interrupt_injection =
+ (vcpu->interrupt_window_open &&
+ vcpu->irq_summary == 0);
+static int __vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+ if (unlikely(vcpu->mp_state == VCPU_MP_STATE_SIPI_RECEIVED)) {
+ pr_debug("vcpu %d received sipi with vector # %x\n",
+ vcpu->vcpu_id, vcpu->sipi_vector);
+ kvm_lapic_reset(vcpu);
+ r = kvm_x86_ops->vcpu_reset(vcpu);
+ vcpu->mp_state = VCPU_MP_STATE_RUNNABLE;
+ if (vcpu->guest_debug.enabled)
+ kvm_x86_ops->guest_debug_pre(vcpu);
+ r = kvm_mmu_reload(vcpu);
+ kvm_inject_pending_timer_irqs(vcpu);
+ preempt_disable();
+ kvm_x86_ops->prepare_guest_switch(vcpu);
+ kvm_load_guest_fpu(vcpu);
+ local_irq_disable();
+ if (signal_pending(current)) {
+ local_irq_enable();
+ preempt_enable();
+ kvm_run->exit_reason = KVM_EXIT_INTR;
+ ++vcpu->stat.signal_exits;
+ if (irqchip_in_kernel(vcpu->kvm))
+ kvm_x86_ops->inject_pending_irq(vcpu);
+ kvm_x86_ops->inject_pending_vectors(vcpu, kvm_run);
+ vcpu->guest_mode = 1;
+ kvm_guest_enter();
+ if (vcpu->requests)
+ if (test_and_clear_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests))
+ kvm_x86_ops->tlb_flush(vcpu);
+ kvm_x86_ops->run(vcpu, kvm_run);
+ vcpu->guest_mode = 0;
+ local_irq_enable();
+ ++vcpu->stat.exits;
+ * We must have an instruction between local_irq_enable() and
+ * kvm_guest_exit(), so the timer interrupt isn't delayed by
+ * the interrupt shadow. The stat.exits increment will do nicely.
+ * But we need to prevent reordering, hence this barrier():
+ kvm_guest_exit();
+ preempt_enable();
+ * Profile KVM exit RIPs:
+ if (unlikely(prof_on == KVM_PROFILING)) {
+ kvm_x86_ops->cache_regs(vcpu);
+ profile_hit(KVM_PROFILING, (void *)vcpu->rip);
+ r = kvm_x86_ops->handle_exit(kvm_run, vcpu);
+ if (dm_request_for_irq_injection(vcpu, kvm_run)) {
+ kvm_run->exit_reason = KVM_EXIT_INTR;
+ ++vcpu->stat.request_irq_exits;
+ if (!need_resched())
+ kvm_resched(vcpu);
+ post_kvm_run_save(vcpu, kvm_run);
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+ sigset_t sigsaved;
+ if (unlikely(vcpu->mp_state == VCPU_MP_STATE_UNINITIALIZED)) {
+ kvm_vcpu_block(vcpu);
+ if (vcpu->sigset_active)
+ sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
+ /* re-sync apic's tpr */
+ if (!irqchip_in_kernel(vcpu->kvm))
+ set_cr8(vcpu, kvm_run->cr8);
+ if (vcpu->pio.cur_count) {
+ r = complete_pio(vcpu);
+#ifdef CONFIG_HAS_IOMEM
+ if (vcpu->mmio_needed) {
+ memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
+ vcpu->mmio_read_completed = 1;
+ vcpu->mmio_needed = 0;
+ r = emulate_instruction(vcpu, kvm_run,
+ vcpu->mmio_fault_cr2, 0, 1);
+ if (r == EMULATE_DO_MMIO) {
+ * Read-modify-write. Back to userspace.
+ if (kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
+ kvm_x86_ops->cache_regs(vcpu);
+ vcpu->regs[VCPU_REGS_RAX] = kvm_run->hypercall.ret;
+ kvm_x86_ops->decache_regs(vcpu);
+ r = __vcpu_run(vcpu, kvm_run);
+ if (vcpu->sigset_active)
+ sigprocmask(SIG_SETMASK, &sigsaved, NULL);
+int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+ kvm_x86_ops->cache_regs(vcpu);
+ regs->rax = vcpu->regs[VCPU_REGS_RAX];
+ regs->rbx = vcpu->regs[VCPU_REGS_RBX];
+ regs->rcx = vcpu->regs[VCPU_REGS_RCX];
+ regs->rdx = vcpu->regs[VCPU_REGS_RDX];
+ regs->rsi = vcpu->regs[VCPU_REGS_RSI];
+ regs->rdi = vcpu->regs[VCPU_REGS_RDI];
+ regs->rsp = vcpu->regs[VCPU_REGS_RSP];
+ regs->rbp = vcpu->regs[VCPU_REGS_RBP];
+#ifdef CONFIG_X86_64
+ regs->r8 = vcpu->regs[VCPU_REGS_R8];
+ regs->r9 = vcpu->regs[VCPU_REGS_R9];
+ regs->r10 = vcpu->regs[VCPU_REGS_R10];
+ regs->r11 = vcpu->regs[VCPU_REGS_R11];
+ regs->r12 = vcpu->regs[VCPU_REGS_R12];
+ regs->r13 = vcpu->regs[VCPU_REGS_R13];
+ regs->r14 = vcpu->regs[VCPU_REGS_R14];
+ regs->r15 = vcpu->regs[VCPU_REGS_R15];
+ regs->rip = vcpu->rip;
+ regs->rflags = kvm_x86_ops->get_rflags(vcpu);
+ * Don't leak debug flags in case they were set for guest debugging
+ if (vcpu->guest_debug.enabled && vcpu->guest_debug.singlestep)
+ regs->rflags &= ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
+int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+ vcpu->regs[VCPU_REGS_RAX] = regs->rax;
+ vcpu->regs[VCPU_REGS_RBX] = regs->rbx;
+ vcpu->regs[VCPU_REGS_RCX] = regs->rcx;
+ vcpu->regs[VCPU_REGS_RDX] = regs->rdx;
+ vcpu->regs[VCPU_REGS_RSI] = regs->rsi;
+ vcpu->regs[VCPU_REGS_RDI] = regs->rdi;
+ vcpu->regs[VCPU_REGS_RSP] = regs->rsp;
+ vcpu->regs[VCPU_REGS_RBP] = regs->rbp;
+#ifdef CONFIG_X86_64
+ vcpu->regs[VCPU_REGS_R8] = regs->r8;
+ vcpu->regs[VCPU_REGS_R9] = regs->r9;
+ vcpu->regs[VCPU_REGS_R10] = regs->r10;
+ vcpu->regs[VCPU_REGS_R11] = regs->r11;
+ vcpu->regs[VCPU_REGS_R12] = regs->r12;
+ vcpu->regs[VCPU_REGS_R13] = regs->r13;
+ vcpu->regs[VCPU_REGS_R14] = regs->r14;
+ vcpu->regs[VCPU_REGS_R15] = regs->r15;
+ vcpu->rip = regs->rip;
+ kvm_x86_ops->set_rflags(vcpu, regs->rflags);
+ kvm_x86_ops->decache_regs(vcpu);
+static void get_segment(struct kvm_vcpu *vcpu,
+ struct kvm_segment *var, int seg)
+ return kvm_x86_ops->get_segment(vcpu, var, seg);
+void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
+ struct kvm_segment cs;
+ get_segment(vcpu, &cs, VCPU_SREG_CS);
+EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
+int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
+ struct kvm_sregs *sregs)
+ struct descriptor_table dt;
+ get_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
+ get_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
+ get_segment(vcpu, &sregs->es, VCPU_SREG_ES);
+ get_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
+ get_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
+ get_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
+ get_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
+ get_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
+ kvm_x86_ops->get_idt(vcpu, &dt);
+ sregs->idt.limit = dt.limit;
+ sregs->idt.base = dt.base;
+ kvm_x86_ops->get_gdt(vcpu, &dt);
+ sregs->gdt.limit = dt.limit;
+ sregs->gdt.base = dt.base;
+ kvm_x86_ops->decache_cr4_guest_bits(vcpu);
+ sregs->cr0 = vcpu->cr0;
+ sregs->cr2 = vcpu->cr2;
+ sregs->cr3 = vcpu->cr3;
+ sregs->cr4 = vcpu->cr4;
+ sregs->cr8 = get_cr8(vcpu);
+ sregs->efer = vcpu->shadow_efer;
+ sregs->apic_base = kvm_get_apic_base(vcpu);
+ if (irqchip_in_kernel(vcpu->kvm)) {
+ memset(sregs->interrupt_bitmap, 0,
+ sizeof sregs->interrupt_bitmap);
+ pending_vec = kvm_x86_ops->get_irq(vcpu);
+ if (pending_vec >= 0)
+ set_bit(pending_vec,
+ (unsigned long *)sregs->interrupt_bitmap);
+ memcpy(sregs->interrupt_bitmap, vcpu->irq_pending,
+ sizeof sregs->interrupt_bitmap);
+static void set_segment(struct kvm_vcpu *vcpu,
+ struct kvm_segment *var, int seg)
+ return kvm_x86_ops->set_segment(vcpu, var, seg);
+int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
+ struct kvm_sregs *sregs)
+ int mmu_reset_needed = 0;
+ int i, pending_vec, max_bits;
+ struct descriptor_table dt;
+ dt.limit = sregs->idt.limit;
+ dt.base = sregs->idt.base;
+ kvm_x86_ops->set_idt(vcpu, &dt);
+ dt.limit = sregs->gdt.limit;
+ dt.base = sregs->gdt.base;
+ kvm_x86_ops->set_gdt(vcpu, &dt);
+ vcpu->cr2 = sregs->cr2;
+ mmu_reset_needed |= vcpu->cr3 != sregs->cr3;
+ vcpu->cr3 = sregs->cr3;
+ set_cr8(vcpu, sregs->cr8);
+ mmu_reset_needed |= vcpu->shadow_efer != sregs->efer;
+#ifdef CONFIG_X86_64
+ kvm_x86_ops->set_efer(vcpu, sregs->efer);
+ kvm_set_apic_base(vcpu, sregs->apic_base);
+ kvm_x86_ops->decache_cr4_guest_bits(vcpu);
+ mmu_reset_needed |= vcpu->cr0 != sregs->cr0;
+ vcpu->cr0 = sregs->cr0;
+ kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
+ mmu_reset_needed |= vcpu->cr4 != sregs->cr4;
+ kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
+ if (!is_long_mode(vcpu) && is_pae(vcpu))
+ load_pdptrs(vcpu, vcpu->cr3);
+ if (mmu_reset_needed)
+ kvm_mmu_reset_context(vcpu);
+ if (!irqchip_in_kernel(vcpu->kvm)) {
+ memcpy(vcpu->irq_pending, sregs->interrupt_bitmap,
+ sizeof vcpu->irq_pending);
+ vcpu->irq_summary = 0;
+ for (i = 0; i < ARRAY_SIZE(vcpu->irq_pending); ++i)
+ if (vcpu->irq_pending[i])
+ __set_bit(i, &vcpu->irq_summary);
+ max_bits = (sizeof sregs->interrupt_bitmap) << 3;
+ pending_vec = find_first_bit(
+ (const unsigned long *)sregs->interrupt_bitmap,
+ /* Only pending external irq is handled here */
+ if (pending_vec < max_bits) {
+ kvm_x86_ops->set_irq(vcpu, pending_vec);
+ pr_debug("Set back pending irq %d\n",
+ set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
+ set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
+ set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
+ set_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
+ set_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
+ set_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
+ set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
+ set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
+int kvm_arch_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
+ struct kvm_debug_guest *dbg)
+ r = kvm_x86_ops->set_guest_debug(vcpu, dbg);
+ * fxsave fpu state. Taken from x86_64/processor.h. To be killed when
+ * we have asm/x86/processor.h
+ u32 st_space[32]; /* 8*16 bytes for each FP-reg = 128 bytes */
+#ifdef CONFIG_X86_64
+ u32 xmm_space[64]; /* 16*16 bytes for each XMM-reg = 256 bytes */
+ u32 xmm_space[32]; /* 8*16 bytes for each XMM-reg = 128 bytes */
+ * Translate a guest virtual address to a guest physical address.
+int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
+ struct kvm_translation *tr)
+ unsigned long vaddr = tr->linear_address;
+ mutex_lock(&vcpu->kvm->lock);
+ gpa = vcpu->mmu.gva_to_gpa(vcpu, vaddr);
+ tr->physical_address = gpa;
+ tr->valid = gpa != UNMAPPED_GVA;
+ tr->writeable = 1;
+ tr->usermode = 0;
+ mutex_unlock(&vcpu->kvm->lock);
+int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+ struct fxsave *fxsave = (struct fxsave *)&vcpu->guest_fx_image;
+ memcpy(fpu->fpr, fxsave->st_space, 128);
+ fpu->fcw = fxsave->cwd;
+ fpu->fsw = fxsave->swd;
+ fpu->ftwx = fxsave->twd;
+ fpu->last_opcode = fxsave->fop;
+ fpu->last_ip = fxsave->rip;
+ fpu->last_dp = fxsave->rdp;
+ memcpy(fpu->xmm, fxsave->xmm_space, sizeof fxsave->xmm_space);
+int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+ struct fxsave *fxsave = (struct fxsave *)&vcpu->guest_fx_image;
+ memcpy(fxsave->st_space, fpu->fpr, 128);
+ fxsave->cwd = fpu->fcw;
+ fxsave->swd = fpu->fsw;
+ fxsave->twd = fpu->ftwx;
+ fxsave->fop = fpu->last_opcode;
+ fxsave->rip = fpu->last_ip;
+ fxsave->rdp = fpu->last_dp;
+ memcpy(fxsave->xmm_space, fpu->xmm, sizeof fxsave->xmm_space);
+void fx_init(struct kvm_vcpu *vcpu)
+ unsigned after_mxcsr_mask;
+ /* Initialize guest FPU by resetting ours and saving into guest's */
+ preempt_disable();
+ fx_save(&vcpu->host_fx_image);
+ fx_save(&vcpu->guest_fx_image);
+ fx_restore(&vcpu->host_fx_image);
+ preempt_enable();
+ vcpu->cr0 |= X86_CR0_ET;
+ after_mxcsr_mask = offsetof(struct i387_fxsave_struct, st_space);
+ vcpu->guest_fx_image.mxcsr = 0x1f80;
+ memset((void *)&vcpu->guest_fx_image + after_mxcsr_mask,
+ 0, sizeof(struct i387_fxsave_struct) - after_mxcsr_mask);
+EXPORT_SYMBOL_GPL(fx_init);
+void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
+ if (!vcpu->fpu_active || vcpu->guest_fpu_loaded)
+ vcpu->guest_fpu_loaded = 1;
+ fx_save(&vcpu->host_fx_image);
+ fx_restore(&vcpu->guest_fx_image);
+EXPORT_SYMBOL_GPL(kvm_load_guest_fpu);
+void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
+ if (!vcpu->guest_fpu_loaded)
+ vcpu->guest_fpu_loaded = 0;
+ fx_save(&vcpu->guest_fx_image);
+ fx_restore(&vcpu->host_fx_image);
+ ++vcpu->stat.fpu_reload;
+EXPORT_SYMBOL_GPL(kvm_put_guest_fpu);
+void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
+ kvm_x86_ops->vcpu_free(vcpu);
+struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
+ return kvm_x86_ops->vcpu_create(kvm, id);
+int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
+ /* We do fxsave: this must be aligned. */
+ BUG_ON((unsigned long)&vcpu->host_fx_image & 0xF);
+ r = kvm_arch_vcpu_reset(vcpu);
+ r = kvm_mmu_setup(vcpu);
+ kvm_x86_ops->vcpu_free(vcpu);
+void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
+ kvm_mmu_unload(vcpu);
+ kvm_x86_ops->vcpu_free(vcpu);
+int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu)
+ return kvm_x86_ops->vcpu_reset(vcpu);
+void kvm_arch_hardware_enable(void *garbage)
+ kvm_x86_ops->hardware_enable(garbage);
+void kvm_arch_hardware_disable(void *garbage)
+ kvm_x86_ops->hardware_disable(garbage);
+int kvm_arch_hardware_setup(void)
+ return kvm_x86_ops->hardware_setup();
+void kvm_arch_hardware_unsetup(void)
+ kvm_x86_ops->hardware_unsetup();
+void kvm_arch_check_processor_compat(void *rtn)
+ kvm_x86_ops->check_processor_compatibility(rtn);
+int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
+ struct page *page;
+ BUG_ON(vcpu->kvm == NULL);
+ vcpu->mmu.root_hpa = INVALID_PAGE;
+ if (!irqchip_in_kernel(kvm) || vcpu->vcpu_id == 0)
+ vcpu->mp_state = VCPU_MP_STATE_RUNNABLE;
+ vcpu->mp_state = VCPU_MP_STATE_UNINITIALIZED;
+ page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ vcpu->pio_data = page_address(page);
+ r = kvm_mmu_create(vcpu);
+ goto fail_free_pio_data;
+ if (irqchip_in_kernel(kvm)) {
+ r = kvm_create_lapic(vcpu);
+ goto fail_mmu_destroy;
+ kvm_mmu_destroy(vcpu);
+fail_free_pio_data:
+ free_page((unsigned long)vcpu->pio_data);
+void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
+ kvm_free_lapic(vcpu);
+ kvm_mmu_destroy(vcpu);
+ free_page((unsigned long)vcpu->pio_data);
+struct kvm *kvm_arch_create_vm(void)
+ struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
+ return ERR_PTR(-ENOMEM);
+ INIT_LIST_HEAD(&kvm->active_mmu_pages);
+static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
+ kvm_mmu_unload(vcpu);
+static void kvm_free_vcpus(struct kvm *kvm)
+ * Unpin any mmu pages first.
+ for (i = 0; i < KVM_MAX_VCPUS; ++i)
+ if (kvm->vcpus[i])
+ kvm_unload_vcpu_mmu(kvm->vcpus[i]);
+ for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+ if (kvm->vcpus[i]) {
+ kvm_arch_vcpu_free(kvm->vcpus[i]);
+ kvm->vcpus[i] = NULL;
+void kvm_arch_destroy_vm(struct kvm *kvm)
+ kfree(kvm->vpic);
+ kfree(kvm->vioapic);
+ kvm_free_vcpus(kvm);
+ kvm_free_physmem(kvm);
+int kvm_arch_set_memory_region(struct kvm *kvm,
+ struct kvm_userspace_memory_region *mem,
+ struct kvm_memory_slot old,
+ int npages = mem->memory_size >> PAGE_SHIFT;
+ struct kvm_memory_slot *memslot = &kvm->memslots[mem->slot];
+ /*To keep backward compatibility with older userspace,
+ *x86 needs to handle !user_alloc case.
+ if (!user_alloc) {
+ if (npages && !old.rmap) {
+ down_write(&current->mm->mmap_sem);
+ memslot->userspace_addr = do_mmap(NULL, 0,
+ npages * PAGE_SIZE,
+ PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_ANONYMOUS,
+ up_write(&current->mm->mmap_sem);
+ if (IS_ERR((void *)memslot->userspace_addr))
+ return PTR_ERR((void *)memslot->userspace_addr);
+ if (!old.user_alloc && old.rmap) {
+ down_write(&current->mm->mmap_sem);
+ ret = do_munmap(current->mm, old.userspace_addr,
+ old.npages * PAGE_SIZE);
+ up_write(&current->mm->mmap_sem);
+ printk(KERN_WARNING
+ "kvm_vm_ioctl_set_memory_region: "
+ "failed to munmap memory\n");
+ if (!kvm->n_requested_mmu_pages) {
+ unsigned int nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);
+ kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
+ kvm_mmu_slot_remove_write_access(kvm, mem->slot);
+ kvm_flush_remote_tlbs(kvm);
diff --git a/drivers/kvm/x86.h b/drivers/kvm/x86.h
new file mode 100644
index 0000000..78ab1e1
--- /dev/null
+++ b/drivers/kvm/x86.h
+ * Kernel-based Virtual Machine driver for Linux
+ * This header defines architecture specific interfaces, x86 version
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/kvm.h>
+#include <linux/kvm_para.h>
+#define CR3_PAE_RESERVED_BITS ((X86_CR3_PWT | X86_CR3_PCD) - 1)
+#define CR3_NONPAE_RESERVED_BITS ((PAGE_SIZE-1) & ~(X86_CR3_PWT | X86_CR3_PCD))
+#define CR3_L_MODE_RESERVED_BITS (CR3_NONPAE_RESERVED_BITS|0xFFFFFF0000000000ULL)
+#define KVM_GUEST_CR0_MASK \
+ (X86_CR0_PG | X86_CR0_PE | X86_CR0_WP | X86_CR0_NE \
+ | X86_CR0_NW | X86_CR0_CD)
+#define KVM_VM_CR0_ALWAYS_ON \
+ (X86_CR0_PG | X86_CR0_PE | X86_CR0_WP | X86_CR0_NE | X86_CR0_TS \
+#define KVM_GUEST_CR4_MASK \
+ (X86_CR4_VME | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_PGE | X86_CR4_VMXE)
+#define KVM_PMODE_VM_CR4_ALWAYS_ON (X86_CR4_PAE | X86_CR4_VMXE)
+#define KVM_RMODE_VM_CR4_ALWAYS_ON (X86_CR4_VME | X86_CR4_PAE | X86_CR4_VMXE)
+#define INVALID_PAGE (~(hpa_t)0)
+#define UNMAPPED_GVA (~(gpa_t)0)
+#define DE_VECTOR 0
+#define UD_VECTOR 6
+#define NM_VECTOR 7
+#define DF_VECTOR 8
+#define TS_VECTOR 10
+#define NP_VECTOR 11
+#define SS_VECTOR 12
+#define GP_VECTOR 13
+#define PF_VECTOR 14
+#define SELECTOR_TI_MASK (1 << 2)
+#define SELECTOR_RPL_MASK 0x03
+#define IOPL_SHIFT 12
+extern spinlock_t kvm_lock;
+extern struct list_head vm_list;
+ VCPU_REGS_RAX = 0,
+ VCPU_REGS_RCX = 1,
+ VCPU_REGS_RDX = 2,
+ VCPU_REGS_RBX = 3,
+ VCPU_REGS_RSP = 4,
+ VCPU_REGS_RBP = 5,
+ VCPU_REGS_RSI = 6,
+ VCPU_REGS_RDI = 7,
+#ifdef CONFIG_X86_64
+ VCPU_REGS_R8 = 8,
+ VCPU_REGS_R9 = 9,
+ VCPU_REGS_R10 = 10,
+ VCPU_REGS_R11 = 11,
+ VCPU_REGS_R12 = 12,
+ VCPU_REGS_R13 = 13,
+ VCPU_REGS_R14 = 14,
+ VCPU_REGS_R15 = 15,
+#include "x86_emulate.h"
+ int interrupt_window_open;
+ unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
+ DECLARE_BITMAP(irq_pending, KVM_NR_INTERRUPTS);
+ unsigned long regs[NR_VCPU_REGS]; /* for rsp: vcpu_load_rsp_rip() */
+ unsigned long rip; /* needs vcpu_load_rsp_rip() */
+ unsigned long cr0;
+ unsigned long cr2;
+ unsigned long cr3;
+ unsigned long cr4;
+ unsigned long cr8;
+ u64 pdptrs[4]; /* pae */
+ struct kvm_lapic *apic; /* kernel irqchip context */
+#define VCPU_MP_STATE_RUNNABLE 0
+#define VCPU_MP_STATE_UNINITIALIZED 1
+#define VCPU_MP_STATE_INIT_RECEIVED 2
+#define VCPU_MP_STATE_SIPI_RECEIVED 3
+#define VCPU_MP_STATE_HALTED 4
+ u64 ia32_misc_enable_msr;
+ struct kvm_mmu mmu;
+ struct kvm_mmu_memory_cache mmu_pte_chain_cache;
+ struct kvm_mmu_memory_cache mmu_rmap_desc_cache;
+ struct kvm_mmu_memory_cache mmu_page_cache;
+ struct kvm_mmu_memory_cache mmu_page_header_cache;
+ gfn_t last_pt_write_gfn;
+ int last_pt_write_count;
+ u64 *last_pte_updated;
+ struct i387_fxsave_struct host_fx_image;
+ struct i387_fxsave_struct guest_fx_image;
+ gva_t mmio_fault_cr2;
+ struct kvm_pio_request pio;
+ struct kvm_save_segment {
+ unsigned long base;
+ } tr, es, ds, fs, gs;
+ int halt_request; /* real mode on Intel only */
+ struct kvm_cpuid_entry2 cpuid_entries[KVM_MAX_CPUID_ENTRIES];
+ /* emulate context */
+ struct x86_emulate_ctxt emulate_ctxt;
+struct kvm_x86_ops {
+ int (*cpu_has_kvm_support)(void); /* __init */
+ int (*disabled_by_bios)(void); /* __init */
+ void (*hardware_enable)(void *dummy); /* __init */
+ void (*hardware_disable)(void *dummy);
+ void (*check_processor_compatibility)(void *rtn);
+ int (*hardware_setup)(void); /* __init */
+ void (*hardware_unsetup)(void); /* __exit */
+ /* Create, but do not attach this VCPU */
+ struct kvm_vcpu *(*vcpu_create)(struct kvm *kvm, unsigned id);
+ void (*vcpu_free)(struct kvm_vcpu *vcpu);
+ int (*vcpu_reset)(struct kvm_vcpu *vcpu);
+ void (*prepare_guest_switch)(struct kvm_vcpu *vcpu);
+ void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
+ void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ void (*vcpu_decache)(struct kvm_vcpu *vcpu);
+ int (*set_guest_debug)(struct kvm_vcpu *vcpu,
+ struct kvm_debug_guest *dbg);
+ void (*guest_debug_pre)(struct kvm_vcpu *vcpu);
+ int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
+ int (*set_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
+ u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg);
+ void (*get_segment)(struct kvm_vcpu *vcpu,
+ struct kvm_segment *var, int seg);
+ void (*set_segment)(struct kvm_vcpu *vcpu,
+ struct kvm_segment *var, int seg);
+ void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
+ void (*decache_cr4_guest_bits)(struct kvm_vcpu *vcpu);
+ void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
+ void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
+ void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
+ void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
+ void (*get_idt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
+ void (*set_idt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
+ void (*get_gdt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
+ void (*set_gdt)(struct kvm_vcpu *vcpu, struct descriptor_table *dt);
+ unsigned long (*get_dr)(struct kvm_vcpu *vcpu, int dr);
+ void (*set_dr)(struct kvm_vcpu *vcpu, int dr, unsigned long value,
+ void (*cache_regs)(struct kvm_vcpu *vcpu);
+ void (*decache_regs)(struct kvm_vcpu *vcpu);
+ unsigned long (*get_rflags)(struct kvm_vcpu *vcpu);
+ void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags);
+ void (*tlb_flush)(struct kvm_vcpu *vcpu);
+ void (*inject_page_fault)(struct kvm_vcpu *vcpu,
+ unsigned long addr, u32 err_code);
+ void (*inject_gp)(struct kvm_vcpu *vcpu, unsigned err_code);
+ void (*run)(struct kvm_vcpu *vcpu, struct kvm_run *run);
+ int (*handle_exit)(struct kvm_run *run, struct kvm_vcpu *vcpu);
+ void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+ void (*patch_hypercall)(struct kvm_vcpu *vcpu,
+ unsigned char *hypercall_addr);
+ int (*get_irq)(struct kvm_vcpu *vcpu);
+ void (*set_irq)(struct kvm_vcpu *vcpu, int vec);
+ void (*inject_pending_irq)(struct kvm_vcpu *vcpu);
+ void (*inject_pending_vectors)(struct kvm_vcpu *vcpu,
+ struct kvm_run *run);
+ int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
+extern struct kvm_x86_ops *kvm_x86_ops;
+int kvm_mmu_module_init(void);
+void kvm_mmu_module_exit(void);
+void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
+int kvm_mmu_create(struct kvm_vcpu *vcpu);
+int kvm_mmu_setup(struct kvm_vcpu *vcpu);
+void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte);
+int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
+void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
+void kvm_mmu_zap_all(struct kvm *kvm);
+unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
+void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
+enum emulation_result {
+ EMULATE_DONE, /* no further processing */
+ EMULATE_DO_MMIO, /* kvm_run filled with mmio request */
+ EMULATE_FAIL, /* can't emulate this instruction */
+int emulate_instruction(struct kvm_vcpu *vcpu, struct kvm_run *run,
+ unsigned long cr2, u16 error_code, int no_decode);
+void kvm_report_emulation_failure(struct kvm_vcpu *vcpu, const char *context);
+void realmode_lgdt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
+void realmode_lidt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
+void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
+ unsigned long *rflags);
+unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr);
+void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
+ unsigned long *rflags);
+int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data);
15509
+int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
15511
+struct x86_emulate_ctxt;
15513
+int kvm_emulate_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
15514
+ int size, unsigned port);
15515
+int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
15516
+ int size, unsigned long count, int down,
15517
+ gva_t address, int rep, unsigned port);
15518
+void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
15519
+int kvm_emulate_halt(struct kvm_vcpu *vcpu);
15520
+int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
15521
+int emulate_clts(struct kvm_vcpu *vcpu);
15522
+int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr,
15523
+ unsigned long *dest);
15524
+int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr,
15525
+ unsigned long value);
15527
+void set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
15528
+void set_cr3(struct kvm_vcpu *vcpu, unsigned long cr0);
15529
+void set_cr4(struct kvm_vcpu *vcpu, unsigned long cr0);
15530
+void set_cr8(struct kvm_vcpu *vcpu, unsigned long cr0);
15531
+unsigned long get_cr8(struct kvm_vcpu *vcpu);
15532
+void lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
15533
+void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l);
15535
+int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
15536
+int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data);
15538
+void fx_init(struct kvm_vcpu *vcpu);
15540
+int emulator_read_std(unsigned long addr,
15542
+ unsigned int bytes,
15543
+ struct kvm_vcpu *vcpu);
15544
+int emulator_write_emulated(unsigned long addr,
15546
+ unsigned int bytes,
15547
+ struct kvm_vcpu *vcpu);
15549
+unsigned long segment_base(u16 selector);
15551
+void kvm_mmu_flush_tlb(struct kvm_vcpu *vcpu);
15552
+void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
15553
+ const u8 *new, int bytes);
15554
+int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
15555
+void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
15556
+int kvm_mmu_load(struct kvm_vcpu *vcpu);
15557
+void kvm_mmu_unload(struct kvm_vcpu *vcpu);
15559
+int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
15561
+int kvm_fix_hypercall(struct kvm_vcpu *vcpu);
15563
+int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code);
15565
+static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
15567
+ if (unlikely(vcpu->kvm->n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
15568
+ __kvm_mmu_free_some_pages(vcpu);
15571
+static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
15573
+ if (likely(vcpu->mmu.root_hpa != INVALID_PAGE))
15576
+ return kvm_mmu_load(vcpu);
15579
+static inline int is_long_mode(struct kvm_vcpu *vcpu)
15581
+#ifdef CONFIG_X86_64
15582
+ return vcpu->shadow_efer & EFER_LME;
15588
+static inline int is_pae(struct kvm_vcpu *vcpu)
15590
+ return vcpu->cr4 & X86_CR4_PAE;
15593
+static inline int is_pse(struct kvm_vcpu *vcpu)
15595
+ return vcpu->cr4 & X86_CR4_PSE;
15598
+static inline int is_paging(struct kvm_vcpu *vcpu)
15600
+ return vcpu->cr0 & X86_CR0_PG;
15603
+int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
15604
+int complete_pio(struct kvm_vcpu *vcpu);
15606
+static inline struct kvm_mmu_page *page_header(hpa_t shadow_page)
15608
+ struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT);
15610
+ return (struct kvm_mmu_page *)page_private(page);
15613
+static inline u16 read_fs(void)
15616
+ asm("mov %%fs, %0" : "=g"(seg));
15620
+static inline u16 read_gs(void)
15623
+ asm("mov %%gs, %0" : "=g"(seg));
15627
+static inline u16 read_ldt(void)
15630
+ asm("sldt %0" : "=g"(ldt));
15634
+static inline void load_fs(u16 sel)
15636
+ asm("mov %0, %%fs" : : "rm"(sel));
15639
+static inline void load_gs(u16 sel)
15641
+ asm("mov %0, %%gs" : : "rm"(sel));
15645
+static inline void load_ldt(u16 sel)
15647
+ asm("lldt %0" : : "rm"(sel));
15651
+static inline void get_idt(struct descriptor_table *table)
15653
+ asm("sidt %0" : "=m"(*table));
15656
+static inline void get_gdt(struct descriptor_table *table)
15658
+ asm("sgdt %0" : "=m"(*table));
15661
+static inline unsigned long read_tr_base(void)
15664
+ asm("str %0" : "=g"(tr));
15665
+ return segment_base(tr);
15668
+#ifdef CONFIG_X86_64
15669
+static inline unsigned long read_msr(unsigned long msr)
15673
+ rdmsrl(msr, value);
15678
+static inline void fx_save(struct i387_fxsave_struct *image)
15680
+ asm("fxsave (%0)":: "r" (image));
15683
+static inline void fx_restore(struct i387_fxsave_struct *image)
15685
+ asm("fxrstor (%0)":: "r" (image));
15688
+static inline void fpu_init(void)
15693
+static inline u32 get_rdx_init_val(void)
15695
+ return 0x600; /* P6 family */
15698
+#define ASM_VMX_VMCLEAR_RAX ".byte 0x66, 0x0f, 0xc7, 0x30"
15699
+#define ASM_VMX_VMLAUNCH ".byte 0x0f, 0x01, 0xc2"
15700
+#define ASM_VMX_VMRESUME ".byte 0x0f, 0x01, 0xc3"
15701
+#define ASM_VMX_VMPTRLD_RAX ".byte 0x0f, 0xc7, 0x30"
15702
+#define ASM_VMX_VMREAD_RDX_RAX ".byte 0x0f, 0x78, 0xd0"
15703
+#define ASM_VMX_VMWRITE_RAX_RDX ".byte 0x0f, 0x79, 0xd0"
15704
+#define ASM_VMX_VMWRITE_RSP_RDX ".byte 0x0f, 0x79, 0xd4"
15705
+#define ASM_VMX_VMXOFF ".byte 0x0f, 0x01, 0xc4"
15706
+#define ASM_VMX_VMXON_RAX ".byte 0xf3, 0x0f, 0xc7, 0x30"
15708
+#define MSR_IA32_TIME_STAMP_COUNTER 0x010
15710
+#define TSS_IOPB_BASE_OFFSET 0x66
15711
+#define TSS_BASE_SIZE 0x68
15712
+#define TSS_IOPB_SIZE (65536 / 8)
15713
+#define TSS_REDIRECTION_SIZE (256 / 8)
15714
+#define RMODE_TSS_SIZE (TSS_BASE_SIZE + TSS_REDIRECTION_SIZE + TSS_IOPB_SIZE + 1)
15716
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index bd46de6..f2a4708 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
 #include <stdint.h>
 #include <public/xen.h>
-#define DPRINTF(_f, _a ...) printf( _f , ## _a )
+#define DPRINTF(_f, _a ...) printf(_f , ## _a)
 #define DPRINTF(x...) do {} while (0)
 #include "x86_emulate.h"
 /* Destination is only written; never read. */
 #define BitOp (1<<8)
+#define MemAbs (1<<9) /* Memory operand is absolute displacement */
+#define String (1<<10) /* String instruction (rep capable) */
-static u8 opcode_table[256] = {
+static u16 opcode_table[256] = {
 ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
 ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
@@ -96,14 +99,14 @@ static u8 opcode_table[256] = {
 ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
 ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
- /* 0x40 - 0x4F */
- 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ /* 0x40 - 0x47 */
+ DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg,
+ /* 0x48 - 0x4F */
+ DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg,
- ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
- ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+ SrcReg, SrcReg, SrcReg, SrcReg, SrcReg, SrcReg, SrcReg, SrcReg,
- ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
- ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+ DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg,
 0, 0, 0, DstReg | SrcMem32 | ModRM | Mov /* movsxd (x86/64) */ ,
@@ -129,14 +132,14 @@ static u8 opcode_table[256] = {
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps, ImplicitOps, 0, 0,
- ByteOp | DstReg | SrcMem | Mov, DstReg | SrcMem | Mov,
- ByteOp | DstMem | SrcReg | Mov, DstMem | SrcReg | Mov,
- ByteOp | ImplicitOps | Mov, ImplicitOps | Mov,
- ByteOp | ImplicitOps, ImplicitOps,
+ ByteOp | DstReg | SrcMem | Mov | MemAbs, DstReg | SrcMem | Mov | MemAbs,
+ ByteOp | DstMem | SrcReg | Mov | MemAbs, DstMem | SrcReg | Mov | MemAbs,
+ ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String,
+ ByteOp | ImplicitOps | String, ImplicitOps | String,
- 0, 0, ByteOp | ImplicitOps | Mov, ImplicitOps | Mov,
- ByteOp | ImplicitOps | Mov, ImplicitOps | Mov,
- ByteOp | ImplicitOps, ImplicitOps,
+ 0, 0, ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String,
+ ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String,
+ ByteOp | ImplicitOps | String, ImplicitOps | String,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
@@ -157,10 +160,10 @@ static u8 opcode_table[256] = {
 ImplicitOps, SrcImm|ImplicitOps, 0, SrcImmByte|ImplicitOps, 0, 0, 0, 0,
+ ImplicitOps, ImplicitOps,
 ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM,
+ ImplicitOps, 0, ImplicitOps, ImplicitOps,
 0, 0, ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM
@@ -222,13 +225,6 @@ static u16 twobyte_table[256] = {
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
-/* Type, address-of, and value of an instruction's operand. */
- enum { OP_REG, OP_MEM, OP_IMM } type;
- unsigned int bytes;
- unsigned long val, orig_val, *ptr;
 /* EFLAGS bit definitions. */
 #define EFLG_OF (1<<11)
 #define EFLG_DF (1<<10)
@@ -292,21 +288,21 @@ struct operand {
 switch ((_dst).bytes) { \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","4","2") \
+ _PRE_EFLAGS("0", "4", "2") \
 _op"w %"_wx"3,%1; " \
- _POST_EFLAGS("0","4","2") \
+ _POST_EFLAGS("0", "4", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), \
- : _wy ((_src).val), "i" (EFLAGS_MASK) ); \
+ : _wy ((_src).val), "i" (EFLAGS_MASK)); \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","4","2") \
+ _PRE_EFLAGS("0", "4", "2") \
 _op"l %"_lx"3,%1; " \
- _POST_EFLAGS("0","4","2") \
+ _POST_EFLAGS("0", "4", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), \
- : _ly ((_src).val), "i" (EFLAGS_MASK) ); \
+ : _ly ((_src).val), "i" (EFLAGS_MASK)); \
 __emulate_2op_8byte(_op, _src, _dst, \
@@ -318,16 +314,15 @@ struct operand {
 #define __emulate_2op(_op,_src,_dst,_eflags,_bx,_by,_wx,_wy,_lx,_ly,_qx,_qy) \
 unsigned long _tmp; \
- switch ( (_dst).bytes ) \
+ switch ((_dst).bytes) { \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","4","2") \
+ _PRE_EFLAGS("0", "4", "2") \
 _op"b %"_bx"3,%1; " \
- _POST_EFLAGS("0","4","2") \
+ _POST_EFLAGS("0", "4", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), \
- : _by ((_src).val), "i" (EFLAGS_MASK) ); \
+ : _by ((_src).val), "i" (EFLAGS_MASK)); \
 __emulate_2op_nobyte(_op, _src, _dst, _eflags, \
@@ -356,34 +351,33 @@ struct operand {
 unsigned long _tmp; \
- switch ( (_dst).bytes ) \
+ switch ((_dst).bytes) { \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","3","2") \
+ _PRE_EFLAGS("0", "3", "2") \
- _POST_EFLAGS("0","3","2") \
+ _POST_EFLAGS("0", "3", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), \
- : "i" (EFLAGS_MASK) ); \
+ : "i" (EFLAGS_MASK)); \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","3","2") \
+ _PRE_EFLAGS("0", "3", "2") \
- _POST_EFLAGS("0","3","2") \
+ _POST_EFLAGS("0", "3", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), \
- : "i" (EFLAGS_MASK) ); \
+ : "i" (EFLAGS_MASK)); \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","3","2") \
+ _PRE_EFLAGS("0", "3", "2") \
- _POST_EFLAGS("0","3","2") \
+ _POST_EFLAGS("0", "3", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), \
- : "i" (EFLAGS_MASK) ); \
+ : "i" (EFLAGS_MASK)); \
 __emulate_1op_8byte(_op, _dst, _eflags); \
@@ -396,21 +390,21 @@ struct operand {
 #define __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy) \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","4","2") \
+ _PRE_EFLAGS("0", "4", "2") \
 _op"q %"_qx"3,%1; " \
- _POST_EFLAGS("0","4","2") \
+ _POST_EFLAGS("0", "4", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), "=&r" (_tmp) \
- : _qy ((_src).val), "i" (EFLAGS_MASK) ); \
+ : _qy ((_src).val), "i" (EFLAGS_MASK)); \
 #define __emulate_1op_8byte(_op, _dst, _eflags) \
 __asm__ __volatile__ ( \
- _PRE_EFLAGS("0","3","2") \
+ _PRE_EFLAGS("0", "3", "2") \
- _POST_EFLAGS("0","3","2") \
+ _POST_EFLAGS("0", "3", "2") \
 : "=m" (_eflags), "=m" ((_dst).val), "=&r" (_tmp) \
- : "i" (EFLAGS_MASK) ); \
+ : "i" (EFLAGS_MASK)); \
 #elif defined(__i386__)
@@ -421,9 +415,8 @@ struct operand {
 /* Fetch next part of the instruction being emulated. */
 #define insn_fetch(_type, _size, _eip) \
 ({ unsigned long _x; \
- rc = ops->read_std((unsigned long)(_eip) + ctxt->cs_base, &_x, \
- (_size), ctxt->vcpu); \
+ rc = do_insn_fetch(ctxt, ops, (_eip), &_x, (_size)); \
 (_eip) += (_size); \
@@ -431,26 +424,63 @@ struct operand {
15950
/* Access/update address held in a register, based on addressing mode. */
15951
#define address_mask(reg) \
15952
- ((ad_bytes == sizeof(unsigned long)) ? \
15953
- (reg) : ((reg) & ((1UL << (ad_bytes << 3)) - 1)))
15954
+ ((c->ad_bytes == sizeof(unsigned long)) ? \
15955
+ (reg) : ((reg) & ((1UL << (c->ad_bytes << 3)) - 1)))
15956
#define register_address(base, reg) \
15957
((base) + address_mask(reg))
15958
#define register_address_increment(reg, inc) \
15960
/* signed type ensures sign extension to long */ \
15961
int _inc = (inc); \
15962
- if ( ad_bytes == sizeof(unsigned long) ) \
15963
+ if (c->ad_bytes == sizeof(unsigned long)) \
15966
- (reg) = ((reg) & ~((1UL << (ad_bytes << 3)) - 1)) | \
15967
- (((reg) + _inc) & ((1UL << (ad_bytes << 3)) - 1)); \
15968
+ (reg) = ((reg) & \
15969
+ ~((1UL << (c->ad_bytes << 3)) - 1)) | \
15970
+ (((reg) + _inc) & \
15971
+ ((1UL << (c->ad_bytes << 3)) - 1)); \
15974
#define JMP_REL(rel) \
15976
- register_address_increment(_eip, rel); \
15977
+ register_address_increment(c->eip, rel); \
15980
+static int do_fetch_insn_byte(struct x86_emulate_ctxt *ctxt,
15981
+ struct x86_emulate_ops *ops,
15982
+ unsigned long linear, u8 *dest)
15984
+ struct fetch_cache *fc = &ctxt->decode.fetch;
15988
+ if (linear < fc->start || linear >= fc->end) {
15989
+ size = min(15UL, PAGE_SIZE - offset_in_page(linear));
15990
+ rc = ops->read_std(linear, fc->data, size, ctxt->vcpu);
15993
+ fc->start = linear;
15994
+ fc->end = linear + size;
15996
+ *dest = fc->data[linear - fc->start];
16000
+static int do_insn_fetch(struct x86_emulate_ctxt *ctxt,
16001
+ struct x86_emulate_ops *ops,
16002
+ unsigned long eip, void *dest, unsigned size)
16006
+ eip += ctxt->cs_base;
16008
+ rc = do_fetch_insn_byte(ctxt, ops, eip++, dest++);
16016
* Given the 'reg' portion of a ModRM byte, and a register block, return a
16017
* pointer into the block that addresses the relevant register.
16018
@@ -521,466 +551,888 @@ static int test_cc(unsigned int condition, unsigned int flags)
16019
return (!!rc ^ (condition & 1));
16022
+static void decode_register_operand(struct operand *op,
16023
+ struct decode_cache *c,
16024
+ int inhibit_bytereg)
16026
+ unsigned reg = c->modrm_reg;
16027
+ int highbyte_regs = c->rex_prefix == 0;
16029
+ if (!(c->d & ModRM))
16030
+ reg = (c->b & 7) | ((c->rex_prefix & 1) << 3);
16031
+ op->type = OP_REG;
16032
+ if ((c->d & ByteOp) && !inhibit_bytereg) {
16033
+ op->ptr = decode_register(reg, c->regs, highbyte_regs);
16034
+ op->val = *(u8 *)op->ptr;
16037
+ op->ptr = decode_register(reg, c->regs, 0);
16038
+ op->bytes = c->op_bytes;
16039
+ switch (op->bytes) {
16041
+ op->val = *(u16 *)op->ptr;
16044
+ op->val = *(u32 *)op->ptr;
16047
+ op->val = *(u64 *) op->ptr;
16051
+ op->orig_val = op->val;
16054
+static int decode_modrm(struct x86_emulate_ctxt *ctxt,
16055
+ struct x86_emulate_ops *ops)
16057
+ struct decode_cache *c = &ctxt->decode;
16059
+ int index_reg = 0, base_reg = 0, scale, rip_relative = 0;
16062
+ if (c->rex_prefix) {
16063
+ c->modrm_reg = (c->rex_prefix & 4) << 1; /* REX.R */
16064
+ index_reg = (c->rex_prefix & 2) << 2; /* REX.X */
16065
+ c->modrm_rm = base_reg = (c->rex_prefix & 1) << 3; /* REG.B */
16068
+ c->modrm = insn_fetch(u8, 1, c->eip);
16069
+ c->modrm_mod |= (c->modrm & 0xc0) >> 6;
16070
+ c->modrm_reg |= (c->modrm & 0x38) >> 3;
16071
+ c->modrm_rm |= (c->modrm & 0x07);
16073
+ c->use_modrm_ea = 1;
16075
+ if (c->modrm_mod == 3) {
16076
+ c->modrm_val = *(unsigned long *)
16077
+ decode_register(c->modrm_rm, c->regs, c->d & ByteOp);
16081
+ if (c->ad_bytes == 2) {
16082
+ unsigned bx = c->regs[VCPU_REGS_RBX];
16083
+ unsigned bp = c->regs[VCPU_REGS_RBP];
16084
+ unsigned si = c->regs[VCPU_REGS_RSI];
16085
+ unsigned di = c->regs[VCPU_REGS_RDI];
16087
+ /* 16-bit ModR/M decode. */
16088
+ switch (c->modrm_mod) {
16090
+ if (c->modrm_rm == 6)
16091
+ c->modrm_ea += insn_fetch(u16, 2, c->eip);
16094
+ c->modrm_ea += insn_fetch(s8, 1, c->eip);
16097
+ c->modrm_ea += insn_fetch(u16, 2, c->eip);
16100
+ switch (c->modrm_rm) {
16102
+ c->modrm_ea += bx + si;
16105
+ c->modrm_ea += bx + di;
16108
+ c->modrm_ea += bp + si;
16111
+ c->modrm_ea += bp + di;
16114
+ c->modrm_ea += si;
16117
+ c->modrm_ea += di;
16120
+ if (c->modrm_mod != 0)
16121
+ c->modrm_ea += bp;
16124
+ c->modrm_ea += bx;
16127
+ if (c->modrm_rm == 2 || c->modrm_rm == 3 ||
16128
+ (c->modrm_rm == 6 && c->modrm_mod != 0))
16129
+ if (!c->override_base)
16130
+ c->override_base = &ctxt->ss_base;
16131
+ c->modrm_ea = (u16)c->modrm_ea;
16133
+ /* 32/64-bit ModR/M decode. */
16134
+ switch (c->modrm_rm) {
16137
+ sib = insn_fetch(u8, 1, c->eip);
16138
+ index_reg |= (sib >> 3) & 7;
16139
+ base_reg |= sib & 7;
16140
+ scale = sib >> 6;
16142
+ switch (base_reg) {
16144
+ if (c->modrm_mod != 0)
16145
+ c->modrm_ea += c->regs[base_reg];
16148
+ insn_fetch(s32, 4, c->eip);
16151
+ c->modrm_ea += c->regs[base_reg];
16153
+ switch (index_reg) {
16157
+ c->modrm_ea += c->regs[index_reg] << scale;
16161
+ if (c->modrm_mod != 0)
16162
+ c->modrm_ea += c->regs[c->modrm_rm];
16163
+ else if (ctxt->mode == X86EMUL_MODE_PROT64)
16164
+ rip_relative = 1;
16167
+ c->modrm_ea += c->regs[c->modrm_rm];
16170
+ switch (c->modrm_mod) {
16172
+ if (c->modrm_rm == 5)
16173
+ c->modrm_ea += insn_fetch(s32, 4, c->eip);
16176
+ c->modrm_ea += insn_fetch(s8, 1, c->eip);
16179
+ c->modrm_ea += insn_fetch(s32, 4, c->eip);
16183
+ if (rip_relative) {
16184
+ c->modrm_ea += c->eip;
16185
+ switch (c->d & SrcMask) {
16187
+ c->modrm_ea += 1;
16190
+ if (c->d & ByteOp)
16191
+ c->modrm_ea += 1;
16193
+ if (c->op_bytes == 8)
16194
+ c->modrm_ea += 4;
16196
+ c->modrm_ea += c->op_bytes;
16203
+static int decode_abs(struct x86_emulate_ctxt *ctxt,
16204
+ struct x86_emulate_ops *ops)
16206
+ struct decode_cache *c = &ctxt->decode;
16209
+ switch (c->ad_bytes) {
16211
+ c->modrm_ea = insn_fetch(u16, 2, c->eip);
16214
+ c->modrm_ea = insn_fetch(u32, 4, c->eip);
16217
+ c->modrm_ea = insn_fetch(u64, 8, c->eip);
16225
-x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
16226
+x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
16229
- u8 b, sib, twobyte = 0, rex_prefix = 0;
16230
- u8 modrm, modrm_mod = 0, modrm_reg = 0, modrm_rm = 0;
16231
- unsigned long *override_base = NULL;
16232
- unsigned int op_bytes, ad_bytes, lock_prefix = 0, rep_prefix = 0, i;
16233
+ struct decode_cache *c = &ctxt->decode;
16235
- struct operand src, dst;
16236
- unsigned long cr2 = ctxt->cr2;
16237
int mode = ctxt->mode;
16238
- unsigned long modrm_ea;
16239
- int use_modrm_ea, index_reg = 0, base_reg = 0, scale, rip_relative = 0;
16242
+ int def_op_bytes, def_ad_bytes;
16244
/* Shadow copy of register state. Committed on successful emulation. */
16245
- unsigned long _regs[NR_VCPU_REGS];
16246
- unsigned long _eip = ctxt->vcpu->rip, _eflags = ctxt->eflags;
16247
- unsigned long modrm_val = 0;
16249
- memcpy(_regs, ctxt->vcpu->regs, sizeof _regs);
16250
+ memset(c, 0, sizeof(struct decode_cache));
16251
+ c->eip = ctxt->vcpu->rip;
16252
+ memcpy(c->regs, ctxt->vcpu->regs, sizeof c->regs);
16255
case X86EMUL_MODE_REAL:
16256
case X86EMUL_MODE_PROT16:
16257
- op_bytes = ad_bytes = 2;
16258
+ def_op_bytes = def_ad_bytes = 2;
16260
case X86EMUL_MODE_PROT32:
16261
- op_bytes = ad_bytes = 4;
16262
+ def_op_bytes = def_ad_bytes = 4;
16264
#ifdef CONFIG_X86_64
16265
case X86EMUL_MODE_PROT64:
16268
+ def_op_bytes = 4;
16269
+ def_ad_bytes = 8;
16276
+ c->op_bytes = def_op_bytes;
16277
+ c->ad_bytes = def_ad_bytes;
16279
/* Legacy prefixes. */
16280
- for (i = 0; i < 8; i++) {
16281
- switch (b = insn_fetch(u8, 1, _eip)) {
16283
+ switch (c->b = insn_fetch(u8, 1, c->eip)) {
16284
case 0x66: /* operand-size override */
16285
- op_bytes ^= 6; /* switch between 2/4 bytes */
16286
+ /* switch between 2/4 bytes */
16287
+ c->op_bytes = def_op_bytes ^ 6;
16289
case 0x67: /* address-size override */
16290
if (mode == X86EMUL_MODE_PROT64)
16291
- ad_bytes ^= 12; /* switch between 4/8 bytes */
16292
+ /* switch between 4/8 bytes */
16293
+ c->ad_bytes = def_ad_bytes ^ 12;
16295
- ad_bytes ^= 6; /* switch between 2/4 bytes */
16296
+ /* switch between 2/4 bytes */
16297
+ c->ad_bytes = def_ad_bytes ^ 6;
16299
case 0x2e: /* CS override */
16300
- override_base = &ctxt->cs_base;
16301
+ c->override_base = &ctxt->cs_base;
16303
case 0x3e: /* DS override */
16304
- override_base = &ctxt->ds_base;
16305
+ c->override_base = &ctxt->ds_base;
16307
case 0x26: /* ES override */
16308
- override_base = &ctxt->es_base;
16309
+ c->override_base = &ctxt->es_base;
16311
case 0x64: /* FS override */
16312
- override_base = &ctxt->fs_base;
16313
+ c->override_base = &ctxt->fs_base;
16315
case 0x65: /* GS override */
16316
- override_base = &ctxt->gs_base;
16317
+ c->override_base = &ctxt->gs_base;
16319
case 0x36: /* SS override */
16320
- override_base = &ctxt->ss_base;
16321
+ c->override_base = &ctxt->ss_base;
16323
+ case 0x40 ... 0x4f: /* REX */
16324
+ if (mode != X86EMUL_MODE_PROT64)
16325
+ goto done_prefixes;
16326
+ c->rex_prefix = c->b;
16328
case 0xf0: /* LOCK */
16330
+ c->lock_prefix = 1;
16332
case 0xf2: /* REPNE/REPNZ */
16333
+ c->rep_prefix = REPNE_PREFIX;
16335
case 0xf3: /* REP/REPE/REPZ */
16337
+ c->rep_prefix = REPE_PREFIX;
16340
goto done_prefixes;
16343
+ /* Any legacy prefix after a REX prefix nullifies its effect. */
16345
+ c->rex_prefix = 0;
16351
- if ((mode == X86EMUL_MODE_PROT64) && ((b & 0xf0) == 0x40)) {
16354
- op_bytes = 8; /* REX.W */
16355
- modrm_reg = (b & 4) << 1; /* REX.R */
16356
- index_reg = (b & 2) << 2; /* REX.X */
16357
- modrm_rm = base_reg = (b & 1) << 3; /* REG.B */
16358
- b = insn_fetch(u8, 1, _eip);
16360
+ if (c->rex_prefix)
16361
+ if (c->rex_prefix & 8)
16362
+ c->op_bytes = 8; /* REX.W */
16364
/* Opcode byte(s). */
16365
- d = opcode_table[b];
16367
+ c->d = opcode_table[c->b];
16369
/* Two-byte opcode? */
16372
- b = insn_fetch(u8, 1, _eip);
16373
- d = twobyte_table[b];
16374
+ if (c->b == 0x0f) {
16376
+ c->b = insn_fetch(u8, 1, c->eip);
16377
+ c->d = twobyte_table[c->b];
16380
/* Unrecognised? */
16382
- goto cannot_emulate;
16384
+ DPRINTF("Cannot emulate %02x\n", c->b);
16389
/* ModRM and SIB bytes. */
16391
- modrm = insn_fetch(u8, 1, _eip);
16392
- modrm_mod |= (modrm & 0xc0) >> 6;
16393
- modrm_reg |= (modrm & 0x38) >> 3;
16394
- modrm_rm |= (modrm & 0x07);
16396
- use_modrm_ea = 1;
16398
- if (modrm_mod == 3) {
16399
- modrm_val = *(unsigned long *)
16400
- decode_register(modrm_rm, _regs, d & ByteOp);
16403
+ if (c->d & ModRM)
16404
+ rc = decode_modrm(ctxt, ops);
16405
+ else if (c->d & MemAbs)
16406
+ rc = decode_abs(ctxt, ops);
16410
- if (ad_bytes == 2) {
16411
- unsigned bx = _regs[VCPU_REGS_RBX];
16412
- unsigned bp = _regs[VCPU_REGS_RBP];
16413
- unsigned si = _regs[VCPU_REGS_RSI];
16414
- unsigned di = _regs[VCPU_REGS_RDI];
16416
- /* 16-bit ModR/M decode. */
16417
- switch (modrm_mod) {
16419
- if (modrm_rm == 6)
16420
- modrm_ea += insn_fetch(u16, 2, _eip);
16423
- modrm_ea += insn_fetch(s8, 1, _eip);
16426
- modrm_ea += insn_fetch(u16, 2, _eip);
16429
- switch (modrm_rm) {
16431
- modrm_ea += bx + si;
16434
- modrm_ea += bx + di;
16437
- modrm_ea += bp + si;
16440
- modrm_ea += bp + di;
16449
- if (modrm_mod != 0)
16456
- if (modrm_rm == 2 || modrm_rm == 3 ||
16457
- (modrm_rm == 6 && modrm_mod != 0))
16458
- if (!override_base)
16459
- override_base = &ctxt->ss_base;
16460
- modrm_ea = (u16)modrm_ea;
16462
- /* 32/64-bit ModR/M decode. */
16463
- switch (modrm_rm) {
16466
- sib = insn_fetch(u8, 1, _eip);
16467
- index_reg |= (sib >> 3) & 7;
16468
- base_reg |= sib & 7;
16469
- scale = sib >> 6;
16471
- switch (base_reg) {
16473
- if (modrm_mod != 0)
16474
- modrm_ea += _regs[base_reg];
16476
- modrm_ea += insn_fetch(s32, 4, _eip);
16479
- modrm_ea += _regs[base_reg];
16481
- switch (index_reg) {
16485
- modrm_ea += _regs[index_reg] << scale;
16490
- if (modrm_mod != 0)
16491
- modrm_ea += _regs[modrm_rm];
16492
- else if (mode == X86EMUL_MODE_PROT64)
16493
- rip_relative = 1;
16496
- modrm_ea += _regs[modrm_rm];
16499
- switch (modrm_mod) {
16501
- if (modrm_rm == 5)
16502
- modrm_ea += insn_fetch(s32, 4, _eip);
16505
- modrm_ea += insn_fetch(s8, 1, _eip);
16508
- modrm_ea += insn_fetch(s32, 4, _eip);
16512
- if (!override_base)
16513
- override_base = &ctxt->ds_base;
16514
- if (mode == X86EMUL_MODE_PROT64 &&
16515
- override_base != &ctxt->fs_base &&
16516
- override_base != &ctxt->gs_base)
16517
- override_base = NULL;
16519
- if (override_base)
16520
- modrm_ea += *override_base;
16522
- if (rip_relative) {
16523
- modrm_ea += _eip;
16524
- switch (d & SrcMask) {
16532
- if (op_bytes == 8)
16535
- modrm_ea += op_bytes;
16538
- if (ad_bytes != 8)
16539
- modrm_ea = (u32)modrm_ea;
16544
+ if (!c->override_base)
16545
+ c->override_base = &ctxt->ds_base;
16546
+ if (mode == X86EMUL_MODE_PROT64 &&
16547
+ c->override_base != &ctxt->fs_base &&
16548
+ c->override_base != &ctxt->gs_base)
16549
+ c->override_base = NULL;
16551
+ if (c->override_base)
16552
+ c->modrm_ea += *c->override_base;
16554
+ if (c->ad_bytes != 8)
16555
+ c->modrm_ea = (u32)c->modrm_ea;
16557
* Decode and fetch the source operand: register, memory
16560
- switch (d & SrcMask) {
16561
+ switch (c->d & SrcMask) {
16565
- src.type = OP_REG;
16566
- if (d & ByteOp) {
16567
- src.ptr = decode_register(modrm_reg, _regs,
16568
- (rex_prefix == 0));
16569
- src.val = src.orig_val = *(u8 *) src.ptr;
16572
- src.ptr = decode_register(modrm_reg, _regs, 0);
16573
- switch ((src.bytes = op_bytes)) {
16575
- src.val = src.orig_val = *(u16 *) src.ptr;
16578
- src.val = src.orig_val = *(u32 *) src.ptr;
16581
- src.val = src.orig_val = *(u64 *) src.ptr;
16585
+ decode_register_operand(&c->src, c, 0);
16589
+ c->src.bytes = 2;
16590
goto srcmem_common;
16593
+ c->src.bytes = 4;
16594
goto srcmem_common;
16596
- src.bytes = (d & ByteOp) ? 1 : op_bytes;
16597
+ c->src.bytes = (c->d & ByteOp) ? 1 :
16599
/* Don't fetch the address for invlpg: it could be unmapped. */
16600
- if (twobyte && b == 0x01 && modrm_reg == 7)
16601
+ if (c->twobyte && c->b == 0x01
16602
+ && c->modrm_reg == 7)
16606
* For instructions with a ModR/M byte, switch to register
16607
* access if Mod = 3.
16609
- if ((d & ModRM) && modrm_mod == 3) {
16610
- src.type = OP_REG;
16611
+ if ((c->d & ModRM) && c->modrm_mod == 3) {
16612
+ c->src.type = OP_REG;
16615
- src.type = OP_MEM;
16616
- src.ptr = (unsigned long *)cr2;
16618
- if ((rc = ops->read_emulated((unsigned long)src.ptr,
16619
- &src.val, src.bytes, ctxt->vcpu)) != 0)
16621
- src.orig_val = src.val;
16622
+ c->src.type = OP_MEM;
16625
- src.type = OP_IMM;
16626
- src.ptr = (unsigned long *)_eip;
16627
- src.bytes = (d & ByteOp) ? 1 : op_bytes;
16628
- if (src.bytes == 8)
16630
+ c->src.type = OP_IMM;
16631
+ c->src.ptr = (unsigned long *)c->eip;
16632
+ c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
16633
+ if (c->src.bytes == 8)
16634
+ c->src.bytes = 4;
16635
/* NB. Immediates are sign-extended as necessary. */
16636
- switch (src.bytes) {
16637
+ switch (c->src.bytes) {
16639
- src.val = insn_fetch(s8, 1, _eip);
16640
+ c->src.val = insn_fetch(s8, 1, c->eip);
16643
- src.val = insn_fetch(s16, 2, _eip);
16644
+ c->src.val = insn_fetch(s16, 2, c->eip);
16647
- src.val = insn_fetch(s32, 4, _eip);
16648
+ c->src.val = insn_fetch(s32, 4, c->eip);
16653
- src.type = OP_IMM;
16654
- src.ptr = (unsigned long *)_eip;
16656
- src.val = insn_fetch(s8, 1, _eip);
16657
+ c->src.type = OP_IMM;
16658
+ c->src.ptr = (unsigned long *)c->eip;
16659
+ c->src.bytes = 1;
16660
+ c->src.val = insn_fetch(s8, 1, c->eip);
16664
/* Decode and fetch the destination operand: register or memory. */
16665
- switch (d & DstMask) {
16666
+ switch (c->d & DstMask) {
16668
/* Special instructions do their own operand decoding. */
16669
- goto special_insn;
16672
- dst.type = OP_REG;
16674
- && !(twobyte && (b == 0xb6 || b == 0xb7))) {
16675
- dst.ptr = decode_register(modrm_reg, _regs,
16676
- (rex_prefix == 0));
16677
- dst.val = *(u8 *) dst.ptr;
16680
- dst.ptr = decode_register(modrm_reg, _regs, 0);
16681
- switch ((dst.bytes = op_bytes)) {
16683
- dst.val = *(u16 *)dst.ptr;
16686
- dst.val = *(u32 *)dst.ptr;
16689
- dst.val = *(u64 *)dst.ptr;
16693
+ decode_register_operand(&c->dst, c,
16694
+ c->twobyte && (c->b == 0xb6 || c->b == 0xb7));
16697
- dst.type = OP_MEM;
16698
- dst.ptr = (unsigned long *)cr2;
16699
- dst.bytes = (d & ByteOp) ? 1 : op_bytes;
16702
* For instructions with a ModR/M byte, switch to register
16703
* access if Mod = 3.
16705
- if ((d & ModRM) && modrm_mod == 3) {
16706
- dst.type = OP_REG;
16707
+ if ((c->d & ModRM) && c->modrm_mod == 3)
16708
+ c->dst.type = OP_REG;
16710
+ c->dst.type = OP_MEM;
16715
+ return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
16718
+static inline void emulate_push(struct x86_emulate_ctxt *ctxt)
+ struct decode_cache *c = &ctxt->decode;
+ c->dst.type = OP_MEM;
+ c->dst.bytes = c->op_bytes;
+ c->dst.val = c->src.val;
+ register_address_increment(c->regs[VCPU_REGS_RSP], -c->op_bytes);
+ c->dst.ptr = (void *) register_address(ctxt->ss_base,
+ c->regs[VCPU_REGS_RSP]);
+static inline int emulate_grp1a(struct x86_emulate_ctxt *ctxt,
+ struct x86_emulate_ops *ops)
+ struct decode_cache *c = &ctxt->decode;
+ /* 64-bit mode: POP always pops a 64-bit operand. */
+ if (ctxt->mode == X86EMUL_MODE_PROT64)
+ c->dst.bytes = 8;
+ rc = ops->read_std(register_address(ctxt->ss_base,
+ c->regs[VCPU_REGS_RSP]),
+ &c->dst.val, c->dst.bytes, ctxt->vcpu);
+ register_address_increment(c->regs[VCPU_REGS_RSP], c->dst.bytes);
+static inline void emulate_grp2(struct x86_emulate_ctxt *ctxt)
+ struct decode_cache *c = &ctxt->decode;
+ switch (c->modrm_reg) {
+ case 0: /* rol */
+ emulate_2op_SrcB("rol", c->src, c->dst, ctxt->eflags);
+ case 1: /* ror */
+ emulate_2op_SrcB("ror", c->src, c->dst, ctxt->eflags);
+ case 2: /* rcl */
+ emulate_2op_SrcB("rcl", c->src, c->dst, ctxt->eflags);
+ case 3: /* rcr */
+ emulate_2op_SrcB("rcr", c->src, c->dst, ctxt->eflags);
+ case 4: /* sal/shl */
+ case 6: /* sal/shl */
+ emulate_2op_SrcB("sal", c->src, c->dst, ctxt->eflags);
+ case 5: /* shr */
+ emulate_2op_SrcB("shr", c->src, c->dst, ctxt->eflags);
+ case 7: /* sar */
+ emulate_2op_SrcB("sar", c->src, c->dst, ctxt->eflags);
+static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt,
+ struct x86_emulate_ops *ops)
+ struct decode_cache *c = &ctxt->decode;
+ switch (c->modrm_reg) {
+ case 0 ... 1: /* test */
+ * Special case in Grp3: test has an immediate
+ * source operand.
+ c->src.type = OP_IMM;
+ c->src.ptr = (unsigned long *)c->eip;
+ c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+ if (c->src.bytes == 8)
+ c->src.bytes = 4;
+ switch (c->src.bytes) {
+ c->src.val = insn_fetch(s8, 1, c->eip);
+ c->src.val = insn_fetch(s16, 2, c->eip);
+ c->src.val = insn_fetch(s32, 4, c->eip);
- unsigned long mask = ~(dst.bytes * 8 - 1);
+ emulate_2op_SrcV("test", c->src, c->dst, ctxt->eflags);
+ case 2: /* not */
+ c->dst.val = ~c->dst.val;
+ case 3: /* neg */
+ emulate_1op("neg", c->dst, ctxt->eflags);
+ DPRINTF("Cannot emulate %02x\n", c->b);
+ rc = X86EMUL_UNHANDLEABLE;
- dst.ptr = (void *)dst.ptr + (src.val & mask) / 8;
+static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
+ struct x86_emulate_ops *ops)
+ struct decode_cache *c = &ctxt->decode;
+ switch (c->modrm_reg) {
+ case 0: /* inc */
+ emulate_1op("inc", c->dst, ctxt->eflags);
+ case 1: /* dec */
+ emulate_1op("dec", c->dst, ctxt->eflags);
+ case 4: /* jmp abs */
+ if (c->b == 0xff)
+ c->eip = c->dst.val;
+ DPRINTF("Cannot emulate %02x\n", c->b);
+ return X86EMUL_UNHANDLEABLE;
- if (!(d & Mov) && /* optimisation - avoid slow emulated read */
- ((rc = ops->read_emulated((unsigned long)dst.ptr,
- &dst.val, dst.bytes, ctxt->vcpu)) != 0))
+ case 6: /* push */
+ /* 64-bit mode: PUSH always pushes a 64-bit operand. */
+ if (ctxt->mode == X86EMUL_MODE_PROT64) {
+ c->dst.bytes = 8;
+ rc = ops->read_std((unsigned long)c->dst.ptr,
+ &c->dst.val, 8, ctxt->vcpu);
+ register_address_increment(c->regs[VCPU_REGS_RSP],
+ rc = ops->write_emulated(register_address(ctxt->ss_base,
+ c->regs[VCPU_REGS_RSP]),
+ c->dst.bytes, ctxt->vcpu);
+ c->dst.type = OP_NONE;
+ DPRINTF("Cannot emulate %02x\n", c->b);
+ return X86EMUL_UNHANDLEABLE;
+static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
+ struct x86_emulate_ops *ops,
+ unsigned long memop)
+ struct decode_cache *c = &ctxt->decode;
+ rc = ops->read_emulated(memop, &old, 8, ctxt->vcpu);
+ if (((u32) (old >> 0) != (u32) c->regs[VCPU_REGS_RAX]) ||
+ ((u32) (old >> 32) != (u32) c->regs[VCPU_REGS_RDX])) {
+ c->regs[VCPU_REGS_RAX] = (u32) (old >> 0);
+ c->regs[VCPU_REGS_RDX] = (u32) (old >> 32);
+ ctxt->eflags &= ~EFLG_ZF;
+ new = ((u64)c->regs[VCPU_REGS_RCX] << 32) |
+ (u32) c->regs[VCPU_REGS_RBX];
+ rc = ops->cmpxchg_emulated(memop, &old, &new, 8, ctxt->vcpu);
+ ctxt->eflags |= EFLG_ZF;
+static inline int writeback(struct x86_emulate_ctxt *ctxt,
+ struct x86_emulate_ops *ops)
+ struct decode_cache *c = &ctxt->decode;
+ switch (c->dst.type) {
+ /* The 4-byte case *is* correct:
+ * in 64-bit mode we zero-extend.
+ switch (c->dst.bytes) {
+ *(u8 *)c->dst.ptr = (u8)c->dst.val;
+ *(u16 *)c->dst.ptr = (u16)c->dst.val;
+ *c->dst.ptr = (u32)c->dst.val;
+ break; /* 64b: zero-ext */
+ *c->dst.ptr = c->dst.val;
+ if (c->lock_prefix)
+ rc = ops->cmpxchg_emulated(
+ (unsigned long)c->dst.ptr,
+ &c->dst.orig_val,
+ rc = ops->write_emulated(
+ (unsigned long)c->dst.ptr,
+ /* no writeback */
+x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
+ unsigned long memop = 0;
+ unsigned long saved_eip;
+ struct decode_cache *c = &ctxt->decode;
+ /* Shadow copy of register state. Committed on successful emulation.
+ * NOTE: we can copy them from vcpu as x86_decode_insn() doesn't
+ memcpy(c->regs, ctxt->vcpu->regs, sizeof c->regs);
+ saved_eip = c->eip;
+ if (((c->d & ModRM) && (c->modrm_mod != 3)) || (c->d & MemAbs))
+ memop = c->modrm_ea;
+ if (c->rep_prefix && (c->d & String)) {
+ /* All REP prefixes have the same first termination condition */
+ if (c->regs[VCPU_REGS_RCX] == 0) {
+ ctxt->vcpu->rip = c->eip;
+ /* The second termination condition only applies for REPE
+ * and REPNE. Test if the repeat string operation prefix is
+ * REPE/REPZ or REPNE/REPNZ and if it's the case it tests the
+ * corresponding termination condition according to:
+ * - if REPE/REPZ and ZF = 0 then done
+ * - if REPNE/REPNZ and ZF = 1 then done
+ if ((c->b == 0xa6) || (c->b == 0xa7) ||
+ (c->b == 0xae) || (c->b == 0xaf)) {
+ if ((c->rep_prefix == REPE_PREFIX) &&
+ ((ctxt->eflags & EFLG_ZF) == 0)) {
+ ctxt->vcpu->rip = c->eip;
+ if ((c->rep_prefix == REPNE_PREFIX) &&
+ ((ctxt->eflags & EFLG_ZF) == EFLG_ZF)) {
+ ctxt->vcpu->rip = c->eip;
+ c->regs[VCPU_REGS_RCX]--;
+ c->eip = ctxt->vcpu->rip;
- dst.orig_val = dst.val;
+ if (c->src.type == OP_MEM) {
+ c->src.ptr = (unsigned long *)memop;
+ rc = ops->read_emulated((unsigned long)c->src.ptr,
+ c->src.orig_val = c->src.val;
+ if ((c->d & DstMask) == ImplicitOps)
+ goto special_insn;
+ if (c->dst.type == OP_MEM) {
+ c->dst.ptr = (unsigned long *)memop;
+ c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+ if (c->d & BitOp) {
+ unsigned long mask = ~(c->dst.bytes * 8 - 1);
+ c->dst.ptr = (void *)c->dst.ptr +
+ (c->src.val & mask) / 8;
+ if (!(c->d & Mov) &&
+ /* optimisation - avoid slow emulated read */
+ ((rc = ops->read_emulated((unsigned long)c->dst.ptr,
+ c->dst.bytes, ctxt->vcpu)) != 0))
+ c->dst.orig_val = c->dst.val;
case 0x00 ... 0x05:
- emulate_2op_SrcV("add", src, dst, _eflags);
+ emulate_2op_SrcV("add", c->src, c->dst, ctxt->eflags);
case 0x08 ... 0x0d:
- emulate_2op_SrcV("or", src, dst, _eflags);
+ emulate_2op_SrcV("or", c->src, c->dst, ctxt->eflags);
case 0x10 ... 0x15:
- emulate_2op_SrcV("adc", src, dst, _eflags);
+ emulate_2op_SrcV("adc", c->src, c->dst, ctxt->eflags);
case 0x18 ... 0x1d:
- emulate_2op_SrcV("sbb", src, dst, _eflags);
+ emulate_2op_SrcV("sbb", c->src, c->dst, ctxt->eflags);
case 0x20 ... 0x23:
- emulate_2op_SrcV("and", src, dst, _eflags);
+ emulate_2op_SrcV("and", c->src, c->dst, ctxt->eflags);
case 0x24: /* and al imm8 */
- dst.type = OP_REG;
- dst.ptr = &_regs[VCPU_REGS_RAX];
- dst.val = *(u8 *)dst.ptr;
- dst.orig_val = dst.val;
+ c->dst.type = OP_REG;
+ c->dst.ptr = &c->regs[VCPU_REGS_RAX];
+ c->dst.val = *(u8 *)c->dst.ptr;
+ c->dst.bytes = 1;
+ c->dst.orig_val = c->dst.val;
case 0x25: /* and ax imm16, or eax imm32 */
- dst.type = OP_REG;
- dst.bytes = op_bytes;
- dst.ptr = &_regs[VCPU_REGS_RAX];
- if (op_bytes == 2)
- dst.val = *(u16 *)dst.ptr;
+ c->dst.type = OP_REG;
+ c->dst.bytes = c->op_bytes;
+ c->dst.ptr = &c->regs[VCPU_REGS_RAX];
+ if (c->op_bytes == 2)
+ c->dst.val = *(u16 *)c->dst.ptr;
- dst.val = *(u32 *)dst.ptr;
- dst.orig_val = dst.val;
+ c->dst.val = *(u32 *)c->dst.ptr;
+ c->dst.orig_val = c->dst.val;
case 0x28 ... 0x2d:
- emulate_2op_SrcV("sub", src, dst, _eflags);
+ emulate_2op_SrcV("sub", c->src, c->dst, ctxt->eflags);
case 0x30 ... 0x35:
- emulate_2op_SrcV("xor", src, dst, _eflags);
+ emulate_2op_SrcV("xor", c->src, c->dst, ctxt->eflags);
case 0x38 ... 0x3d:
- emulate_2op_SrcV("cmp", src, dst, _eflags);
+ emulate_2op_SrcV("cmp", c->src, c->dst, ctxt->eflags);
+ case 0x40 ... 0x47: /* inc r16/r32 */
+ emulate_1op("inc", c->dst, ctxt->eflags);
+ case 0x48 ... 0x4f: /* dec r16/r32 */
+ emulate_1op("dec", c->dst, ctxt->eflags);
+ case 0x50 ... 0x57: /* push reg */
+ c->dst.type = OP_MEM;
+ c->dst.bytes = c->op_bytes;
+ c->dst.val = c->src.val;
+ register_address_increment(c->regs[VCPU_REGS_RSP],
+ c->dst.ptr = (void *) register_address(
+ ctxt->ss_base, c->regs[VCPU_REGS_RSP]);
+ case 0x58 ... 0x5f: /* pop reg */
+ if ((rc = ops->read_std(register_address(ctxt->ss_base,
+ c->regs[VCPU_REGS_RSP]), c->dst.ptr,
+ c->op_bytes, ctxt->vcpu)) != 0)
+ register_address_increment(c->regs[VCPU_REGS_RSP],
+ c->dst.type = OP_NONE; /* Disable writeback. */
case 0x63: /* movsxd */
- if (mode != X86EMUL_MODE_PROT64)
+ if (ctxt->mode != X86EMUL_MODE_PROT64)
goto cannot_emulate;
- dst.val = (s32) src.val;
+ c->dst.val = (s32) c->src.val;
+ case 0x6a: /* push imm8 */
+ c->src.val = insn_fetch(s8, 1, c->eip);
+ emulate_push(ctxt);
+ case 0x6c: /* insb */
+ case 0x6d: /* insw/insd */
+ if (kvm_emulate_pio_string(ctxt->vcpu, NULL,
+ (c->d & ByteOp) ? 1 : c->op_bytes,
+ address_mask(c->regs[VCPU_REGS_RCX]) : 1,
+ (ctxt->eflags & EFLG_DF),
+ register_address(ctxt->es_base,
+ c->regs[VCPU_REGS_RDI]),
+ c->regs[VCPU_REGS_RDX]) == 0) {
+ c->eip = saved_eip;
+ case 0x6e: /* outsb */
+ case 0x6f: /* outsw/outsd */
+ if (kvm_emulate_pio_string(ctxt->vcpu, NULL,
+ (c->d & ByteOp) ? 1 : c->op_bytes,
+ address_mask(c->regs[VCPU_REGS_RCX]) : 1,
+ (ctxt->eflags & EFLG_DF),
+ register_address(c->override_base ?
+ *c->override_base :
+ c->regs[VCPU_REGS_RSI]),
+ c->regs[VCPU_REGS_RDX]) == 0) {
+ c->eip = saved_eip;
+ case 0x70 ... 0x7f: /* jcc (short) */ {
+ int rel = insn_fetch(s8, 1, c->eip);
+ if (test_cc(c->b, ctxt->eflags))
case 0x80 ... 0x83: /* Grp1 */
- switch (modrm_reg) {
+ switch (c->modrm_reg) {
@@ -1000,505 +1452,434 @@ done_prefixes:
case 0x84 ... 0x85:
- emulate_2op_SrcV("test", src, dst, _eflags);
+ emulate_2op_SrcV("test", c->src, c->dst, ctxt->eflags);
case 0x86 ... 0x87: /* xchg */
/* Write back the register source. */
- switch (dst.bytes) {
+ switch (c->dst.bytes) {
- *(u8 *) src.ptr = (u8) dst.val;
+ *(u8 *) c->src.ptr = (u8) c->dst.val;
- *(u16 *) src.ptr = (u16) dst.val;
+ *(u16 *) c->src.ptr = (u16) c->dst.val;
- *src.ptr = (u32) dst.val;
+ *c->src.ptr = (u32) c->dst.val;
break; /* 64b reg: zero-extend */
- *src.ptr = dst.val;
+ *c->src.ptr = c->dst.val;
* Write back the memory destination with implicit LOCK
- dst.val = src.val;
+ c->dst.val = c->src.val;
+ c->lock_prefix = 1;
case 0x88 ... 0x8b: /* mov */
case 0x8d: /* lea r16/r32, m */
- dst.val = modrm_val;
+ c->dst.val = c->modrm_val;
case 0x8f: /* pop (sole member of Grp1a) */
- /* 64-bit mode: POP always pops a 64-bit operand. */
- if (mode == X86EMUL_MODE_PROT64)
- if ((rc = ops->read_std(register_address(ctxt->ss_base,
- _regs[VCPU_REGS_RSP]),
- &dst.val, dst.bytes, ctxt->vcpu)) != 0)
+ rc = emulate_grp1a(ctxt, ops);
- register_address_increment(_regs[VCPU_REGS_RSP], dst.bytes);
+ case 0x9c: /* pushf */
+ c->src.val = (unsigned long) ctxt->eflags;
+ emulate_push(ctxt);
+ case 0x9d: /* popf */
+ c->dst.ptr = (unsigned long *) &ctxt->eflags;
+ goto pop_instruction;
case 0xa0 ... 0xa1: /* mov */
- dst.ptr = (unsigned long *)&_regs[VCPU_REGS_RAX];
- dst.val = src.val;
- _eip += ad_bytes; /* skip src displacement */
+ c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
+ c->dst.val = c->src.val;
case 0xa2 ... 0xa3: /* mov */
- dst.val = (unsigned long)_regs[VCPU_REGS_RAX];
- _eip += ad_bytes; /* skip dst displacement */
- case 0xc0 ... 0xc1:
- switch (modrm_reg) {
- case 0: /* rol */
- emulate_2op_SrcB("rol", src, dst, _eflags);
- case 1: /* ror */
- emulate_2op_SrcB("ror", src, dst, _eflags);
- case 2: /* rcl */
- emulate_2op_SrcB("rcl", src, dst, _eflags);
- case 3: /* rcr */
- emulate_2op_SrcB("rcr", src, dst, _eflags);
- case 4: /* sal/shl */
- case 6: /* sal/shl */
- emulate_2op_SrcB("sal", src, dst, _eflags);
- case 5: /* shr */
- emulate_2op_SrcB("shr", src, dst, _eflags);
- case 7: /* sar */
- emulate_2op_SrcB("sar", src, dst, _eflags);
- case 0xc6 ... 0xc7: /* mov (sole member of Grp11) */
- dst.val = src.val;
- case 0xd0 ... 0xd1: /* Grp2 */
- case 0xd2 ... 0xd3: /* Grp2 */
- src.val = _regs[VCPU_REGS_RCX];
- case 0xf6 ... 0xf7: /* Grp3 */
- switch (modrm_reg) {
- case 0 ... 1: /* test */
- * Special case in Grp3: test has an immediate
- * source operand.
- src.type = OP_IMM;
- src.ptr = (unsigned long *)_eip;
- src.bytes = (d & ByteOp) ? 1 : op_bytes;
- if (src.bytes == 8)
- switch (src.bytes) {
- src.val = insn_fetch(s8, 1, _eip);
- src.val = insn_fetch(s16, 2, _eip);
- src.val = insn_fetch(s32, 4, _eip);
- case 2: /* not */
- dst.val = ~dst.val;
- case 3: /* neg */
- emulate_1op("neg", dst, _eflags);
- goto cannot_emulate;
+ c->dst.val = (unsigned long)c->regs[VCPU_REGS_RAX];
- case 0xfe ... 0xff: /* Grp4/Grp5 */
- switch (modrm_reg) {
- case 0: /* inc */
- emulate_1op("inc", dst, _eflags);
- case 1: /* dec */
- emulate_1op("dec", dst, _eflags);
- case 4: /* jmp abs */
- goto cannot_emulate;
- case 6: /* push */
- /* 64-bit mode: PUSH always pushes a 64-bit operand. */
- if (mode == X86EMUL_MODE_PROT64) {
- if ((rc = ops->read_std((unsigned long)dst.ptr,
- ctxt->vcpu)) != 0)
- register_address_increment(_regs[VCPU_REGS_RSP],
- if ((rc = ops->write_emulated(
- register_address(ctxt->ss_base,
- _regs[VCPU_REGS_RSP]),
- &dst.val, dst.bytes, ctxt->vcpu)) != 0)
- goto cannot_emulate;
+ case 0xa4 ... 0xa5: /* movs */
+ c->dst.type = OP_MEM;
+ c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+ c->dst.ptr = (unsigned long *)register_address(
+ c->regs[VCPU_REGS_RDI]);
+ if ((rc = ops->read_emulated(register_address(
+ c->override_base ? *c->override_base :
+ c->regs[VCPU_REGS_RSI]),
+ c->dst.bytes, ctxt->vcpu)) != 0)
- switch (dst.type) {
- /* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
- switch (dst.bytes) {
- *(u8 *)dst.ptr = (u8)dst.val;
- *(u16 *)dst.ptr = (u16)dst.val;
- *dst.ptr = (u32)dst.val;
- break; /* 64b: zero-ext */
- *dst.ptr = dst.val;
- rc = ops->cmpxchg_emulated((unsigned long)dst.
- ptr, &dst.orig_val,
- &dst.val, dst.bytes,
- rc = ops->write_emulated((unsigned long)dst.ptr,
- &dst.val, dst.bytes,
+ register_address_increment(c->regs[VCPU_REGS_RSI],
+ (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
+ register_address_increment(c->regs[VCPU_REGS_RDI],
+ (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
+ case 0xa6 ... 0xa7: /* cmps */
+ c->src.type = OP_NONE; /* Disable writeback. */
+ c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+ c->src.ptr = (unsigned long *)register_address(
+ c->override_base ? *c->override_base :
+ c->regs[VCPU_REGS_RSI]);
+ if ((rc = ops->read_emulated((unsigned long)c->src.ptr,
+ ctxt->vcpu)) != 0)
- /* Commit shadow register state. */
- memcpy(ctxt->vcpu->regs, _regs, sizeof _regs);
- ctxt->eflags = _eflags;
- ctxt->vcpu->rip = _eip;
+ c->dst.type = OP_NONE; /* Disable writeback. */
+ c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+ c->dst.ptr = (unsigned long *)register_address(
+ c->regs[VCPU_REGS_RDI]);
+ if ((rc = ops->read_emulated((unsigned long)c->dst.ptr,
+ ctxt->vcpu)) != 0)
- return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
+ DPRINTF("cmps: mem1=0x%p mem2=0x%p\n", c->src.ptr, c->dst.ptr);
- goto twobyte_special_insn;
- case 0x50 ... 0x57: /* push reg */
- if (op_bytes == 2)
- src.val = (u16) _regs[b & 0x7];
- src.val = (u32) _regs[b & 0x7];
- dst.type = OP_MEM;
- dst.bytes = op_bytes;
- dst.val = src.val;
- register_address_increment(_regs[VCPU_REGS_RSP], -op_bytes);
- dst.ptr = (void *) register_address(
- ctxt->ss_base, _regs[VCPU_REGS_RSP]);
- case 0x58 ... 0x5f: /* pop reg */
- dst.ptr = (unsigned long *)&_regs[b & 0x7];
- if ((rc = ops->read_std(register_address(ctxt->ss_base,
- _regs[VCPU_REGS_RSP]), dst.ptr, op_bytes, ctxt->vcpu))
+ emulate_2op_SrcV("cmp", c->src, c->dst, ctxt->eflags);
- register_address_increment(_regs[VCPU_REGS_RSP], op_bytes);
- no_wb = 1; /* Disable writeback. */
- case 0x6a: /* push imm8 */
- src.val = insn_fetch(s8, 1, _eip);
- dst.type = OP_MEM;
- dst.bytes = op_bytes;
- dst.val = src.val;
- register_address_increment(_regs[VCPU_REGS_RSP], -op_bytes);
- dst.ptr = (void *) register_address(ctxt->ss_base,
- _regs[VCPU_REGS_RSP]);
- case 0x6c: /* insb */
- case 0x6d: /* insw/insd */
- if (kvm_emulate_pio_string(ctxt->vcpu, NULL,
- (d & ByteOp) ? 1 : op_bytes, /* size */
- address_mask(_regs[VCPU_REGS_RCX]) : 1, /* count */
- (_eflags & EFLG_DF), /* down */
- register_address(ctxt->es_base,
- _regs[VCPU_REGS_RDI]), /* address */
- _regs[VCPU_REGS_RDX] /* port */
- case 0x6e: /* outsb */
- case 0x6f: /* outsw/outsd */
- if (kvm_emulate_pio_string(ctxt->vcpu, NULL,
- (d & ByteOp) ? 1 : op_bytes, /* size */
- address_mask(_regs[VCPU_REGS_RCX]) : 1, /* count */
- (_eflags & EFLG_DF), /* down */
- register_address(override_base ?
- *override_base : ctxt->ds_base,
- _regs[VCPU_REGS_RSI]), /* address */
- _regs[VCPU_REGS_RDX] /* port */
- case 0x70 ... 0x7f: /* jcc (short) */ {
- int rel = insn_fetch(s8, 1, _eip);
+ register_address_increment(c->regs[VCPU_REGS_RSI],
17558
+ (ctxt->eflags & EFLG_DF) ? -c->src.bytes
17560
+ register_address_increment(c->regs[VCPU_REGS_RDI],
17561
+ (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
17564
- if (test_cc(b, _eflags))
17568
- case 0x9c: /* pushf */
17569
- src.val = (unsigned long) _eflags;
17571
- case 0x9d: /* popf */
17572
- dst.ptr = (unsigned long *) &_eflags;
17573
- goto pop_instruction;
17574
- case 0xc3: /* ret */
17576
- goto pop_instruction;
17577
- case 0xf4: /* hlt */
17578
- ctxt->vcpu->halt_request = 1;
17581
- if (rep_prefix) {
17582
- if (_regs[VCPU_REGS_RCX] == 0) {
17583
- ctxt->vcpu->rip = _eip;
17586
- _regs[VCPU_REGS_RCX]--;
17587
- _eip = ctxt->vcpu->rip;
17590
- case 0xa4 ... 0xa5: /* movs */
17591
- dst.type = OP_MEM;
17592
- dst.bytes = (d & ByteOp) ? 1 : op_bytes;
17593
- dst.ptr = (unsigned long *)register_address(ctxt->es_base,
17594
- _regs[VCPU_REGS_RDI]);
17595
- if ((rc = ops->read_emulated(register_address(
17596
- override_base ? *override_base : ctxt->ds_base,
17597
- _regs[VCPU_REGS_RSI]), &dst.val, dst.bytes, ctxt->vcpu)) != 0)
17599
- register_address_increment(_regs[VCPU_REGS_RSI],
17600
- (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
17601
- register_address_increment(_regs[VCPU_REGS_RDI],
17602
- (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
17604
- case 0xa6 ... 0xa7: /* cmps */
17605
- DPRINTF("Urk! I don't handle CMPS.\n");
17606
- goto cannot_emulate;
17607
case 0xaa ... 0xab: /* stos */
17608
- dst.type = OP_MEM;
17609
- dst.bytes = (d & ByteOp) ? 1 : op_bytes;
17610
- dst.ptr = (unsigned long *)cr2;
17611
- dst.val = _regs[VCPU_REGS_RAX];
17612
- register_address_increment(_regs[VCPU_REGS_RDI],
17613
- (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
17614
+ c->dst.type = OP_MEM;
17615
+ c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
17616
+ c->dst.ptr = (unsigned long *)register_address(
17618
+ c->regs[VCPU_REGS_RDI]);
17619
+ c->dst.val = c->regs[VCPU_REGS_RAX];
17620
+ register_address_increment(c->regs[VCPU_REGS_RDI],
17621
+ (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
17624
case 0xac ... 0xad: /* lods */
17625
- dst.type = OP_REG;
17626
- dst.bytes = (d & ByteOp) ? 1 : op_bytes;
17627
- dst.ptr = (unsigned long *)&_regs[VCPU_REGS_RAX];
17628
- if ((rc = ops->read_emulated(cr2, &dst.val, dst.bytes,
17629
- ctxt->vcpu)) != 0)
17630
+ c->dst.type = OP_REG;
17631
+ c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
17632
+ c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
17633
+ if ((rc = ops->read_emulated(register_address(
17634
+ c->override_base ? *c->override_base :
17636
+ c->regs[VCPU_REGS_RSI]),
17639
+ ctxt->vcpu)) != 0)
17641
- register_address_increment(_regs[VCPU_REGS_RSI],
17642
- (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
17644
+ register_address_increment(c->regs[VCPU_REGS_RSI],
17645
+ (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
17648
case 0xae ... 0xaf: /* scas */
17649
DPRINTF("Urk! I don't handle SCAS.\n");
17650
goto cannot_emulate;
+ case 0xc0 ... 0xc1:
+ emulate_grp2(ctxt);
+ case 0xc3: /* ret */
+ c->dst.ptr = &c->eip;
+ goto pop_instruction;
+ case 0xc6 ... 0xc7: /* mov (sole member of Grp11) */
+ c->dst.val = c->src.val;
+ case 0xd0 ... 0xd1: /* Grp2 */
+ emulate_grp2(ctxt);
+ case 0xd2 ... 0xd3: /* Grp2 */
+ c->src.val = c->regs[VCPU_REGS_RCX];
+ emulate_grp2(ctxt);
case 0xe8: /* call (near) */ {
- switch (op_bytes) {
+ switch (c->op_bytes) {
- rel = insn_fetch(s16, 2, _eip);
+ rel = insn_fetch(s16, 2, c->eip);
- rel = insn_fetch(s32, 4, _eip);
- rel = insn_fetch(s64, 8, _eip);
+ rel = insn_fetch(s32, 4, c->eip);
DPRINTF("Call: Invalid op_bytes\n");
goto cannot_emulate;
- src.val = (unsigned long) _eip;
+ c->src.val = (unsigned long) c->eip;
- op_bytes = ad_bytes;
+ * emulate_push() saves the value in the size of c->op_bytes; therefore
+ * we set op_bytes here to the size of eip so that the whole value
+ * of eip will be saved
+ c->op_bytes = c->ad_bytes;
+ emulate_push(ctxt);
case 0xe9: /* jmp rel */
case 0xeb: /* jmp rel short */
- JMP_REL(src.val);
- no_wb = 1; /* Disable writeback. */
+ JMP_REL(c->src.val);
+ c->dst.type = OP_NONE; /* Disable writeback. */
+ case 0xf4: /* hlt */
+ ctxt->vcpu->halt_request = 1;
+ case 0xf5: /* cmc */
+ /* complement carry flag from eflags reg */
+ ctxt->eflags ^= EFLG_CF;
+ c->dst.type = OP_NONE; /* Disable writeback. */
+ case 0xf6 ... 0xf7: /* Grp3 */
+ rc = emulate_grp3(ctxt, ops);
+ case 0xf8: /* clc */
+ ctxt->eflags &= ~EFLG_CF;
+ c->dst.type = OP_NONE; /* Disable writeback. */
+ case 0xfa: /* cli */
+ ctxt->eflags &= ~X86_EFLAGS_IF;
+ c->dst.type = OP_NONE; /* Disable writeback. */
+ case 0xfb: /* sti */
+ ctxt->eflags |= X86_EFLAGS_IF;
+ c->dst.type = OP_NONE; /* Disable writeback. */
+ case 0xfe ... 0xff: /* Grp4/Grp5 */
+ rc = emulate_grp45(ctxt, ops);
+ rc = writeback(ctxt, ops);
+ /* Commit shadow register state. */
+ memcpy(ctxt->vcpu->regs, c->regs, sizeof c->regs);
+ ctxt->vcpu->rip = c->eip;
+ if (rc == X86EMUL_UNHANDLEABLE) {
+ c->eip = saved_eip;
case 0x01: /* lgdt, lidt, lmsw */
- /* Disable writeback. */
- switch (modrm_reg) {
+ switch (c->modrm_reg) {
unsigned long address;
- case 2: /* lgdt */
- rc = read_descriptor(ctxt, ops, src.ptr,
- &size, &address, op_bytes);
+ case 0: /* vmcall */
+ if (c->modrm_mod != 3 || c->modrm_rm != 1)
+ goto cannot_emulate;
+ rc = kvm_fix_hypercall(ctxt->vcpu);
- realmode_lgdt(ctxt->vcpu, size, address);
+ kvm_emulate_hypercall(ctxt->vcpu);
- case 3: /* lidt */
- rc = read_descriptor(ctxt, ops, src.ptr,
- &size, &address, op_bytes);
+ case 2: /* lgdt */
+ rc = read_descriptor(ctxt, ops, c->src.ptr,
+ &size, &address, c->op_bytes);
- realmode_lidt(ctxt->vcpu, size, address);
+ realmode_lgdt(ctxt->vcpu, size, address);
+ case 3: /* lidt/vmmcall */
+ if (c->modrm_mod == 3 && c->modrm_rm == 1) {
+ rc = kvm_fix_hypercall(ctxt->vcpu);
+ kvm_emulate_hypercall(ctxt->vcpu);
+ rc = read_descriptor(ctxt, ops, c->src.ptr,
+ realmode_lidt(ctxt->vcpu, size, address);
- if (modrm_mod != 3)
+ if (c->modrm_mod != 3)
goto cannot_emulate;
- *(u16 *)&_regs[modrm_rm]
+ *(u16 *)&c->regs[c->modrm_rm]
= realmode_get_cr(ctxt->vcpu, 0);
- if (modrm_mod != 3)
+ if (c->modrm_mod != 3)
goto cannot_emulate;
- realmode_lmsw(ctxt->vcpu, (u16)modrm_val, &_eflags);
+ realmode_lmsw(ctxt->vcpu, (u16)c->modrm_val,
case 7: /* invlpg*/
- emulate_invlpg(ctxt->vcpu, cr2);
+ emulate_invlpg(ctxt->vcpu, memop);
goto cannot_emulate;
+ /* Disable writeback. */
+ c->dst.type = OP_NONE;
+ emulate_clts(ctxt->vcpu);
+ c->dst.type = OP_NONE;
+ case 0x08: /* invd */
+ case 0x09: /* wbinvd */
+ case 0x0d: /* GrpP (prefetch) */
+ case 0x18: /* Grp16 (prefetch/nop) */
+ c->dst.type = OP_NONE;
+ case 0x20: /* mov cr, reg */
+ if (c->modrm_mod != 3)
+ goto cannot_emulate;
+ c->regs[c->modrm_rm] =
+ realmode_get_cr(ctxt->vcpu, c->modrm_reg);
+ c->dst.type = OP_NONE; /* no writeback */
case 0x21: /* mov from dr to reg */
- if (modrm_mod != 3)
+ if (c->modrm_mod != 3)
+ goto cannot_emulate;
+ rc = emulator_get_dr(ctxt, c->modrm_reg, &c->regs[c->modrm_rm]);
+ goto cannot_emulate;
+ c->dst.type = OP_NONE; /* no writeback */
+ case 0x22: /* mov reg, cr */
+ if (c->modrm_mod != 3)
goto cannot_emulate;
- rc = emulator_get_dr(ctxt, modrm_reg, &_regs[modrm_rm]);
+ realmode_set_cr(ctxt->vcpu,
+ c->modrm_reg, c->modrm_val, &ctxt->eflags);
+ c->dst.type = OP_NONE;
case 0x23: /* mov from reg to dr */
- if (modrm_mod != 3)
+ if (c->modrm_mod != 3)
+ goto cannot_emulate;
+ rc = emulator_set_dr(ctxt, c->modrm_reg,
+ c->regs[c->modrm_rm]);
goto cannot_emulate;
- rc = emulator_set_dr(ctxt, modrm_reg, _regs[modrm_rm]);
+ c->dst.type = OP_NONE; /* no writeback */
+ msr_data = (u32)c->regs[VCPU_REGS_RAX]
+ | ((u64)c->regs[VCPU_REGS_RDX] << 32);
+ rc = kvm_set_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], msr_data);
+ kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
+ c->eip = ctxt->vcpu->rip;
+ rc = X86EMUL_CONTINUE;
+ c->dst.type = OP_NONE;
+ rc = kvm_get_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], &msr_data);
+ kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
+ c->eip = ctxt->vcpu->rip;
+ c->regs[VCPU_REGS_RAX] = (u32)msr_data;
+ c->regs[VCPU_REGS_RDX] = msr_data >> 32;
+ rc = X86EMUL_CONTINUE;
+ c->dst.type = OP_NONE;
case 0x40 ... 0x4f: /* cmov */
- dst.val = dst.orig_val = src.val;
- * First, assume we're decoding an even cmov opcode
- switch ((b & 15) >> 1) {
- case 0: /* cmovo */
- no_wb = (_eflags & EFLG_OF) ? 0 : 1;
- case 1: /* cmovb/cmovc/cmovnae */
- no_wb = (_eflags & EFLG_CF) ? 0 : 1;
- case 2: /* cmovz/cmove */
- no_wb = (_eflags & EFLG_ZF) ? 0 : 1;
- case 3: /* cmovbe/cmovna */
- no_wb = (_eflags & (EFLG_CF | EFLG_ZF)) ? 0 : 1;
- case 4: /* cmovs */
- no_wb = (_eflags & EFLG_SF) ? 0 : 1;
+ c->dst.val = c->dst.orig_val = c->src.val;
+ if (!test_cc(c->b, ctxt->eflags))
+ c->dst.type = OP_NONE; /* no writeback */
+ case 0x80 ... 0x8f: /* jnz rel, etc*/ {
+ switch (c->op_bytes) {
+ rel = insn_fetch(s16, 2, c->eip);
- case 5: /* cmovp/cmovpe */
- no_wb = (_eflags & EFLG_PF) ? 0 : 1;
+ rel = insn_fetch(s32, 4, c->eip);
- case 7: /* cmovle/cmovng */
- no_wb = (_eflags & EFLG_ZF) ? 0 : 1;
- /* fall through */
- case 6: /* cmovl/cmovnge */
- no_wb &= (!(_eflags & EFLG_SF) !=
- !(_eflags & EFLG_OF)) ? 0 : 1;
+ rel = insn_fetch(s64, 8, c->eip);
+ DPRINTF("jnz: Invalid op_bytes\n");
+ goto cannot_emulate;
- /* Odd cmov opcodes (lsb == 1) have inverted sense. */
+ if (test_cc(c->b, ctxt->eflags))
+ c->dst.type = OP_NONE;
- src.val &= (dst.bytes << 3) - 1; /* only subword offset */
- emulate_2op_SrcV_nobyte("bt", src, dst, _eflags);
+ c->dst.type = OP_NONE;
+ /* only subword offset */
+ c->src.val &= (c->dst.bytes << 3) - 1;
+ emulate_2op_SrcV_nobyte("bt", c->src, c->dst, ctxt->eflags);
- src.val &= (dst.bytes << 3) - 1; /* only subword offset */
- emulate_2op_SrcV_nobyte("bts", src, dst, _eflags);
+ /* only subword offset */
+ c->src.val &= (c->dst.bytes << 3) - 1;
+ emulate_2op_SrcV_nobyte("bts", c->src, c->dst, ctxt->eflags);
case 0xb0 ... 0xb1: /* cmpxchg */
* Save real source value, then compare EAX against
- src.orig_val = src.val;
- src.val = _regs[VCPU_REGS_RAX];
- emulate_2op_SrcV("cmp", src, dst, _eflags);
- if (_eflags & EFLG_ZF) {
+ c->src.orig_val = c->src.val;
+ c->src.val = c->regs[VCPU_REGS_RAX];
+ emulate_2op_SrcV("cmp", c->src, c->dst, ctxt->eflags);
+ if (ctxt->eflags & EFLG_ZF) {
/* Success: write back to memory. */
- dst.val = src.orig_val;
+ c->dst.val = c->src.orig_val;
/* Failure: write the value we saw to EAX. */
- dst.type = OP_REG;
- dst.ptr = (unsigned long *)&_regs[VCPU_REGS_RAX];
+ c->dst.type = OP_REG;
+ c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
- src.val &= (dst.bytes << 3) - 1; /* only subword offset */
- emulate_2op_SrcV_nobyte("btr", src, dst, _eflags);
+ /* only subword offset */
+ c->src.val &= (c->dst.bytes << 3) - 1;
+ emulate_2op_SrcV_nobyte("btr", c->src, c->dst, ctxt->eflags);
case 0xb6 ... 0xb7: /* movzx */
- dst.bytes = op_bytes;
- dst.val = (d & ByteOp) ? (u8) src.val : (u16) src.val;
+ c->dst.bytes = c->op_bytes;
+ c->dst.val = (c->d & ByteOp) ? (u8) c->src.val
+ : (u16) c->src.val;
case 0xba: /* Grp8 */
- switch (modrm_reg & 3) {
+ switch (c->modrm_reg & 3) {
@@ -1511,152 +1892,31 @@ twobyte_insn:
18031
- src.val &= (dst.bytes << 3) - 1; /* only subword offset */
18032
- emulate_2op_SrcV_nobyte("btc", src, dst, _eflags);
18033
+ /* only subword offset */
18034
+ c->src.val &= (c->dst.bytes << 3) - 1;
18035
+ emulate_2op_SrcV_nobyte("btc", c->src, c->dst, ctxt->eflags);
18037
case 0xbe ... 0xbf: /* movsx */
18038
- dst.bytes = op_bytes;
18039
- dst.val = (d & ByteOp) ? (s8) src.val : (s16) src.val;
18040
+ c->dst.bytes = c->op_bytes;
18041
+ c->dst.val = (c->d & ByteOp) ? (s8) c->src.val :
18042
+ (s16) c->src.val;
18044
case 0xc3: /* movnti */
18045
- dst.bytes = op_bytes;
18046
- dst.val = (op_bytes == 4) ? (u32) src.val : (u64) src.val;
18051
-twobyte_special_insn:
18052
- /* Disable writeback. */
18056
- emulate_clts(ctxt->vcpu);
18058
- case 0x08: /* invd */
18060
- case 0x09: /* wbinvd */
18062
- case 0x0d: /* GrpP (prefetch) */
18063
- case 0x18: /* Grp16 (prefetch/nop) */
18065
- case 0x20: /* mov cr, reg */
18066
- if (modrm_mod != 3)
18067
- goto cannot_emulate;
18068
- _regs[modrm_rm] = realmode_get_cr(ctxt->vcpu, modrm_reg);
18070
- case 0x22: /* mov reg, cr */
18071
- if (modrm_mod != 3)
18072
- goto cannot_emulate;
18073
- realmode_set_cr(ctxt->vcpu, modrm_reg, modrm_val, &_eflags);
18077
- msr_data = (u32)_regs[VCPU_REGS_RAX]
18078
- | ((u64)_regs[VCPU_REGS_RDX] << 32);
18079
- rc = kvm_set_msr(ctxt->vcpu, _regs[VCPU_REGS_RCX], msr_data);
18081
- kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
18082
- _eip = ctxt->vcpu->rip;
18084
- rc = X86EMUL_CONTINUE;
18088
- rc = kvm_get_msr(ctxt->vcpu, _regs[VCPU_REGS_RCX], &msr_data);
18090
- kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
18091
- _eip = ctxt->vcpu->rip;
18093
- _regs[VCPU_REGS_RAX] = (u32)msr_data;
18094
- _regs[VCPU_REGS_RDX] = msr_data >> 32;
18096
- rc = X86EMUL_CONTINUE;
18098
- case 0x80 ... 0x8f: /* jnz rel, etc*/ {
18101
- switch (op_bytes) {
18103
- rel = insn_fetch(s16, 2, _eip);
18106
- rel = insn_fetch(s32, 4, _eip);
18109
- rel = insn_fetch(s64, 8, _eip);
18112
- DPRINTF("jnz: Invalid op_bytes\n");
18113
- goto cannot_emulate;
18115
- if (test_cc(b, _eflags))
18117
+ c->dst.bytes = c->op_bytes;
18118
+ c->dst.val = (c->op_bytes == 4) ? (u32) c->src.val :
18119
+ (u64) c->src.val;
18122
case 0xc7: /* Grp9 (cmpxchg8b) */
18125
- if ((rc = ops->read_emulated(cr2, &old, 8, ctxt->vcpu))
18128
- if (((u32) (old >> 0) != (u32) _regs[VCPU_REGS_RAX]) ||
18129
- ((u32) (old >> 32) != (u32) _regs[VCPU_REGS_RDX])) {
18130
- _regs[VCPU_REGS_RAX] = (u32) (old >> 0);
18131
- _regs[VCPU_REGS_RDX] = (u32) (old >> 32);
18132
- _eflags &= ~EFLG_ZF;
18134
- new = ((u64)_regs[VCPU_REGS_RCX] << 32)
18135
- | (u32) _regs[VCPU_REGS_RBX];
18136
- if ((rc = ops->cmpxchg_emulated(cr2, &old,
18137
- &new, 8, ctxt->vcpu)) != 0)
18139
- _eflags |= EFLG_ZF;
18143
+ rc = emulate_grp9(ctxt, ops, memop);
18146
+ c->dst.type = OP_NONE;
18152
- DPRINTF("Cannot emulate %02x\n", b);
18153
+ DPRINTF("Cannot emulate %02x\n", c->b);
18154
+ c->eip = saved_eip;
18160
-#include <asm/mm.h>
18161
-#include <asm/uaccess.h>
18164
-x86_emulate_read_std(unsigned long addr,
18165
- unsigned long *val,
18166
- unsigned int bytes, struct x86_emulate_ctxt *ctxt)
18172
- if ((rc = copy_from_user((void *)val, (void *)addr, bytes)) != 0) {
18173
- propagate_page_fault(addr + bytes - rc, 0); /* read fault */
18174
- return X86EMUL_PROPAGATE_FAULT;
18177
- return X86EMUL_CONTINUE;
18181
-x86_emulate_write_std(unsigned long addr,
18182
- unsigned long val,
18183
- unsigned int bytes, struct x86_emulate_ctxt *ctxt)
18187
- if ((rc = copy_to_user((void *)addr, (void *)&val, bytes)) != 0) {
18188
- propagate_page_fault(addr + bytes - rc, PGERR_write_access);
18189
- return X86EMUL_PROPAGATE_FAULT;
18192
- return X86EMUL_CONTINUE;
18196
diff --git a/drivers/kvm/x86_emulate.h b/drivers/kvm/x86_emulate.h
index 92c73aa..7db91b9 100644
--- a/drivers/kvm/x86_emulate.h
+++ b/drivers/kvm/x86_emulate.h
@@ -63,17 +63,6 @@ struct x86_emulate_ops {
			unsigned int bytes, struct kvm_vcpu *vcpu);
-	 * write_std: Write bytes of standard (non-emulated/special) memory.
-	 *           Used for stack operations, and others.
-	 *  @addr:  [IN ] Linear address to which to write.
-	 *  @val:   [IN ] Value to write to memory (low-order bytes used as
-	 *  @bytes: [IN ] Number of bytes to write to memory.
-	int (*write_std)(unsigned long addr, const void *val,
-			 unsigned int bytes, struct kvm_vcpu *vcpu);
	 * read_emulated: Read bytes from emulated/special memory area.
	 *  @addr:  [IN ] Linear address from which to read.
	 *  @val:   [OUT] Value read from memory, zero-extended to 'u_long'.
@@ -112,13 +101,50 @@ struct x86_emulate_ops {
+/* Type, address-of, and value of an instruction's operand. */
+	enum { OP_REG, OP_MEM, OP_IMM, OP_NONE } type;
+	unsigned int bytes;
+	unsigned long val, orig_val, *ptr;
+struct fetch_cache {
+	unsigned long start;
+	unsigned long end;
+struct decode_cache {
+	struct operand src;
+	struct operand dst;
+	unsigned long *override_base;
+	unsigned long regs[NR_VCPU_REGS];
+	unsigned long eip;
+	unsigned long modrm_ea;
+	unsigned long modrm_val;
+	struct fetch_cache fetch;
 struct x86_emulate_ctxt {
	/* Register state before/after emulation. */
	struct kvm_vcpu *vcpu;
	/* Linear faulting address (if emulating a page-faulting instruction). */
	unsigned long eflags;
-	unsigned long cr2;
	/* Emulated execution mode, represented by an X86EMUL_MODE value. */
@@ -129,8 +155,16 @@ struct x86_emulate_ctxt {
	unsigned long ss_base;
	unsigned long gs_base;
	unsigned long fs_base;
+	/* decode cache */
+	struct decode_cache decode;
+/* Repeat String Operation Prefix */
+#define REPE_PREFIX  1
+#define REPNE_PREFIX    2
 /* Execution mode, passed to the emulator. */
 #define X86EMUL_MODE_REAL     0	/* Real mode.             */
 #define X86EMUL_MODE_PROT16   2	/* 16-bit protected mode. */
@@ -144,12 +178,9 @@ struct x86_emulate_ctxt {
 #define X86EMUL_MODE_HOST X86EMUL_MODE_PROT64
- * x86_emulate_memop: Emulate an instruction that faulted attempting to
- *                    read/write a 'special' memory area.
- * Returns -1 on failure, 0 on success.
-int x86_emulate_memop(struct x86_emulate_ctxt *ctxt,
-		      struct x86_emulate_ops *ops);
+int x86_decode_insn(struct x86_emulate_ctxt *ctxt,
+		    struct x86_emulate_ops *ops);
+int x86_emulate_insn(struct x86_emulate_ctxt *ctxt,
+		     struct x86_emulate_ops *ops);
 #endif				/* __X86_EMULATE_H__ */
diff --git a/drivers/mfd/sm501.c b/drivers/mfd/sm501.c
index afd8296..8135e4c 100644
--- a/drivers/mfd/sm501.c
+++ b/drivers/mfd/sm501.c
@@ -156,7 +156,7 @@ static void sm501_dump_clk(struct sm501_devdata *sm)
	dev_dbg(sm->dev, "PM0[%c]: "
		 "P2 %ld.%ld MHz (%ld), V2 %ld.%ld (%ld), "
-		 "M %ld.%ld (%ld), MX1 %ld.%ld (%ld)\n",
+		"M %ld.%ld (%ld), MX1 %ld.%ld (%ld)\n",
		 (pmc & 3 ) == 0 ? '*' : '-',
		 fmt_freq(decode_div(pll2, pm0, 24, 1<<29, 31, px_div)),
		 fmt_freq(decode_div(pll2, pm0, 16, 1<<20, 15, misc_div)),
diff --git a/drivers/misc/sony-laptop.c b/drivers/misc/sony-laptop.c
index b0f6803..bb13858 100644
--- a/drivers/misc/sony-laptop.c
+++ b/drivers/misc/sony-laptop.c
@@ -338,7 +338,7 @@ static void sony_laptop_report_input_event(u8 event)
		dprintk("unknown input event %.2x\n", event);
-static int sony_laptop_setup_input(struct acpi_device *acpi_device)
+static int sony_laptop_setup_input(void)
	struct input_dev *jog_dev;
	struct input_dev *key_dev;
@@ -379,7 +379,6 @@ static int sony_laptop_setup_input(struct acpi_device *acpi_device)
	key_dev->name = "Sony Vaio Keys";
	key_dev->id.bustype = BUS_ISA;
	key_dev->id.vendor = PCI_VENDOR_ID_SONY;
-	key_dev->dev.parent = &acpi_device->dev;
	/* Initialize the Input Drivers: special keys */
	set_bit(EV_KEY, key_dev->evbit);
@@ -411,7 +410,6 @@ static int sony_laptop_setup_input(struct acpi_device *acpi_device)
	jog_dev->name = "Sony Vaio Jogdial";
	jog_dev->id.bustype = BUS_ISA;
	jog_dev->id.vendor = PCI_VENDOR_ID_SONY;
-	key_dev->dev.parent = &acpi_device->dev;
	jog_dev->evbit[0] = BIT_MASK(EV_KEY) | BIT_MASK(EV_REL);
	jog_dev->keybit[BIT_WORD(BTN_MOUSE)] = BIT_MASK(BTN_MIDDLE);
@@ -1008,7 +1006,7 @@ static int sony_nc_add(struct acpi_device *device)
	/* setup input devices and helper fifo */
-	result = sony_laptop_setup_input(device);
+	result = sony_laptop_setup_input();
		printk(KERN_ERR DRV_PFX
				"Unabe to create input devices.\n");
@@ -1036,7 +1034,7 @@ static int sony_nc_add(struct acpi_device *device)
		sony_backlight_device->props.brightness =
		    sony_backlight_get_brightness
		    (sony_backlight_device);
-		sony_backlight_device->props.max_brightness =
+		sony_backlight_device->props.max_brightness = 
		    SONY_MAX_BRIGHTNESS - 1;
@@ -2455,7 +2453,7 @@ static int sony_pic_add(struct acpi_device *device)
	/* setup input devices and helper fifo */
-	result = sony_laptop_setup_input(device);
+	result = sony_laptop_setup_input();
		printk(KERN_ERR DRV_PFX
				"Unabe to create input devices.\n");
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 30cd13b..1b9c9b6 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -180,13 +180,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
	blk_queue_max_hw_segments(mq->queue, host->max_hw_segs);
	blk_queue_max_segment_size(mq->queue, host->max_seg_size);
-	mq->sg = kmalloc(sizeof(struct scatterlist) *
+	mq->sg = kzalloc(sizeof(struct scatterlist) *
		host->max_phys_segs, GFP_KERNEL);
		goto cleanup_queue;
-	sg_init_table(mq->sg, host->max_phys_segs);
	init_MUTEX(&mq->thread_sem);
diff --git a/drivers/mmc/card/sdio_uart.c b/drivers/mmc/card/sdio_uart.c
index eeea84c..d552de6 100644
--- a/drivers/mmc/card/sdio_uart.c
+++ b/drivers/mmc/card/sdio_uart.c
@@ -386,7 +386,7 @@ static void sdio_uart_stop_rx(struct sdio_uart_port *port)
	sdio_out(port, UART_IER, port->ier);
-static void sdio_uart_receive_chars(struct sdio_uart_port *port, unsigned int *status)
+static void sdio_uart_receive_chars(struct sdio_uart_port *port, int *status)
	struct tty_struct *tty = port->tty;
	unsigned int ch, flag;
diff --git a/drivers/net/mlx4/qp.c b/drivers/net/mlx4/qp.c
index fa24e65..42b4763 100644
--- a/drivers/net/mlx4/qp.c
+++ b/drivers/net/mlx4/qp.c
@@ -113,7 +113,7 @@ int mlx4_qp_modify(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
	struct mlx4_cmd_mailbox *mailbox;
-	if (cur_state >= MLX4_QP_NUM_STATE || new_state >= MLX4_QP_NUM_STATE ||
+	if (cur_state >= MLX4_QP_NUM_STATE || cur_state >= MLX4_QP_NUM_STATE ||
	    !op[cur_state][new_state])
diff --git a/drivers/pci/hotplug/acpiphp.h b/drivers/pci/hotplug/acpiphp.h
index 1ef417c..f6cc0c5 100644
--- a/drivers/pci/hotplug/acpiphp.h
+++ b/drivers/pci/hotplug/acpiphp.h
@@ -66,7 +66,7 @@ struct slot {
	char name[SLOT_NAME_SIZE];
 * struct acpiphp_bridge - PCI bridge information
 * for each bridge device in ACPI namespace
@@ -97,7 +97,7 @@ struct acpiphp_bridge {
 * struct acpiphp_slot - PCI slot information
 * PCI slot information for each *physical* PCI slot
@@ -118,7 +118,7 @@ struct acpiphp_slot {
 * struct acpiphp_func - PCI function information
 * PCI function information for each object in ACPI namespace
@@ -137,7 +137,7 @@ struct acpiphp_func {
	u32		flags;		/* see below */
 * struct acpiphp_attention_info - device specific attention registration
 * ACPI has no generic method of setting/getting attention status
diff --git a/drivers/pci/hotplug/acpiphp_core.c b/drivers/pci/hotplug/acpiphp_core.c
index c8c2638..a0ca63a 100644
--- a/drivers/pci/hotplug/acpiphp_core.c
+++ b/drivers/pci/hotplug/acpiphp_core.c
@@ -91,10 +91,10 @@ static struct hotplug_slot_ops acpi_hotplug_slot_ops = {
 * acpiphp_register_attention - set attention LED callback
 * @info: must be completely filled with LED callbacks
- * Description: This is used to register a hardware specific ACPI
+ * Description: this is used to register a hardware specific ACPI
 * driver that manipulates the attention LED.  All the fields in
 * info must be set.
 int acpiphp_register_attention(struct acpiphp_attention_info *info)
	int retval = -EINVAL;
@@ -112,10 +112,10 @@ int acpiphp_register_attention(struct acpiphp_attention_info *info)
 * acpiphp_unregister_attention - unset attention LED callback
 * @info: must match the pointer used to register
- * Description: This is used to un-register a hardware specific acpi
+ * Description: this is used to un-register a hardware specific acpi
 * driver that manipulates the attention LED.  The pointer to the
 * info struct must be the same as the one used to set it.
 int acpiphp_unregister_attention(struct acpiphp_attention_info *info)
	int retval = -EINVAL;
@@ -133,6 +133,7 @@ int acpiphp_unregister_attention(struct acpiphp_attention_info *info)
 * @hotplug_slot: slot to enable
 * Actual tasks are done in acpiphp_enable_slot()
 static int enable_slot(struct hotplug_slot *hotplug_slot)
@@ -150,6 +151,7 @@ static int enable_slot(struct hotplug_slot *hotplug_slot)
 * @hotplug_slot: slot to disable
 * Actual tasks are done in acpiphp_disable_slot()
 static int disable_slot(struct hotplug_slot *hotplug_slot)
@@ -166,15 +168,15 @@ static int disable_slot(struct hotplug_slot *hotplug_slot)
- * set_attention_status - set attention LED
+ * set_attention_status - set attention LED
 * @hotplug_slot: slot to set attention LED on
 * @status: value to set attention LED to (0 or 1)
 * attention status LED, so we use a callback that
 * was registered with us.  This allows hardware specific
 * ACPI implementations to blink the light for us.
 static int set_attention_status(struct hotplug_slot *hotplug_slot, u8 status)
	int retval = -ENODEV;
@@ -197,6 +199,7 @@ static int disable_slot(struct hotplug_slot *hotplug_slot)
 * Some platforms may not implement _STA method properly.
 * In that case, the value returned may not be reliable.
 static int get_power_status(struct hotplug_slot *hotplug_slot, u8 *value)
@@ -210,7 +213,7 @@ static int get_power_status(struct hotplug_slot *hotplug_slot, u8 *value)
 * get_attention_status - get attention LED status
 * @hotplug_slot: slot to get status from
 * @value: returns with value of attention LED
@@ -218,8 +221,8 @@ static int get_power_status(struct hotplug_slot *hotplug_slot, u8 *value)
 * ACPI doesn't have known method to determine the state
 * of the attention status LED, so we use a callback that
 * was registered with us.  This allows hardware specific
- * ACPI implementations to determine its state.
+ * ACPI implementations to determine its state
 static int get_attention_status(struct hotplug_slot *hotplug_slot, u8 *value)
	int retval = -EINVAL;
@@ -241,7 +244,8 @@ static int get_attention_status(struct hotplug_slot *hotplug_slot, u8 *value)
 * @value: pointer to store status
 * ACPI doesn't provide any formal means to access latch status.
- * Instead, we fake latch status from _STA.
+ * Instead, we fake latch status from _STA
 static int get_latch_status(struct hotplug_slot *hotplug_slot, u8 *value)
@@ -261,7 +265,8 @@ static int get_latch_status(struct hotplug_slot *hotplug_slot, u8 *value)
 * @value: pointer to store status
 * ACPI doesn't provide any formal means to access adapter status.
- * Instead, we fake adapter status from _STA.
+ * Instead, we fake adapter status from _STA
 static int get_adapter_status(struct hotplug_slot *hotplug_slot, u8 *value)
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index ff1b1c7..1e125b5 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -82,6 +82,7 @@ static void handle_hotplug_event_func(acpi_handle handle, u32 type, void *contex
 *  2. has _PS0 method
 *  3. has _PS3 method
 static int is_ejectable(acpi_handle handle)
@@ -985,8 +986,10 @@ static int power_off_slot(struct acpiphp_slot *slot)
- * acpiphp_max_busnr - return the highest reserved bus number under the given bus.
+ * acpiphp_max_busnr - return the highest reserved bus number under
 * @bus: bus to start search with
 static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
@@ -1015,6 +1018,7 @@ static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
 * acpiphp_bus_add - add a new bus to acpi subsystem
 * @func: acpiphp_func of the bridge
 static int acpiphp_bus_add(struct acpiphp_func *func)
@@ -1059,6 +1063,7 @@ acpiphp_bus_add_out:
 * acpiphp_bus_trim - trim a bus from acpi subsystem
 * @handle: handle to acpi namespace
 static int acpiphp_bus_trim(acpi_handle handle)
@@ -1084,6 +1089,7 @@ static int acpiphp_bus_trim(acpi_handle handle)
 * This function should be called per *physical slot*,
 * not per each slot object in ACPI namespace.
 static int enable_device(struct acpiphp_slot *slot)
@@ -1179,7 +1185,6 @@ static void disable_bridges(struct pci_bus *bus)
 * disable_device - disable a slot
- * @slot: ACPI PHP slot
 static int disable_device(struct acpiphp_slot *slot)
@@ -1235,15 +1240,14 @@ static int disable_device(struct acpiphp_slot *slot)
 * get_slot_status - get ACPI slot status
- * @slot: ACPI PHP slot
- * If a slot has _STA for each function and if any one of them
- * returned non-zero status, return it.
+ * if a slot has _STA for each function and if any one of them
+ * returned non-zero status, return it
- * If a slot doesn't have _STA and if any one of its functions'
- * configuration space is configured, return 0x0f as a _STA.
+ * if a slot doesn't have _STA and if any one of its functions'
+ * configuration space is configured, return 0x0f as a _STA
- * Otherwise return 0.
+ * otherwise return 0
 static unsigned int get_slot_status(struct acpiphp_slot *slot)
@@ -1277,7 +1281,6 @@ static unsigned int get_slot_status(struct acpiphp_slot *slot)
 * acpiphp_eject_slot - physically eject the slot
- * @slot: ACPI PHP slot
 int acpiphp_eject_slot(struct acpiphp_slot *slot)
@@ -1311,7 +1314,6 @@ int acpiphp_eject_slot(struct acpiphp_slot *slot)
 * acpiphp_check_bridge - re-enumerate devices
- * @bridge: where to begin re-enumeration
 * Iterate over all slots under this bridge and make sure that if a
 * card is present they are enabled, and if not they are disabled.
@@ -1536,11 +1538,13 @@ check_sub_bridges(acpi_handle handle, u32 lvl, void *context, void **rv)
 * handle_hotplug_event_bridge - handle ACPI event on bridges
 * @handle: Notify()'ed acpi_handle
 * @type: Notify code
 * @context: pointer to acpiphp_bridge structure
- * Handles ACPI event notification on {host,p2p} bridges.
+ * handles ACPI event notification on {host,p2p} bridges
 static void handle_hotplug_event_bridge(acpi_handle handle, u32 type, void *context)
@@ -1630,11 +1634,13 @@ static void handle_hotplug_event_bridge(acpi_handle handle, u32 type, void *cont
 * handle_hotplug_event_func - handle ACPI event on functions (i.e. slots)
 * @handle: Notify()'ed acpi_handle
 * @type: Notify code
 * @context: pointer to acpiphp_func structure
- * Handles ACPI event notification on slots.
+ * handles ACPI event notification on slots
 static void handle_hotplug_event_func(acpi_handle handle, u32 type, void *context)
@@ -1699,6 +1705,7 @@ static struct acpi_pci_driver acpi_pci_hp_driver = {
 * acpiphp_glue_init - initializes all PCI hotplug - ACPI glue data structures
 int __init acpiphp_glue_init(void)
@@ -1719,7 +1726,7 @@ int __init acpiphp_glue_init(void)
 * acpiphp_glue_exit - terminates all PCI hotplug - ACPI glue data structures
- * This function frees all data allocated in acpiphp_glue_init().
+ * This function frees all data allocated in acpiphp_glue_init()
 void  acpiphp_glue_exit(void)
@@ -1753,6 +1760,7 @@ int __init acpiphp_get_num_slots(void)
 * acpiphp_for_each_slot - call function for each slot
 * @fn: callback function
 * @data: context to be passed to callback function
 static int acpiphp_for_each_slot(acpiphp_callback fn, void *data)
@@ -1778,7 +1786,6 @@ static int acpiphp_for_each_slot(acpiphp_callback fn, void *data)
 * acpiphp_enable_slot - power on slot
- * @slot: ACPI PHP slot
 int acpiphp_enable_slot(struct acpiphp_slot *slot)
@@ -1808,7 +1815,6 @@ int acpiphp_enable_slot(struct acpiphp_slot *slot)
 * acpiphp_disable_slot - power off slot
- * @slot: ACPI PHP slot
 int acpiphp_disable_slot(struct acpiphp_slot *slot)
diff --git a/drivers/pci/hotplug/acpiphp_ibm.c b/drivers/pci/hotplug/acpiphp_ibm.c
index 47d26b6..56829f8 100644
--- a/drivers/pci/hotplug/acpiphp_ibm.c
+++ b/drivers/pci/hotplug/acpiphp_ibm.c
@@ -134,11 +134,11 @@ static struct acpiphp_attention_info ibm_attention_info =
 * ibm_slot_from_id - workaround for bad ibm hardware
 * @id: the slot number that linux refers to the slot by
- * Description: This method returns the aCPI slot descriptor
+ * Description: this method returns the aCPI slot descriptor
 * corresponding to the Linux slot number.  This descriptor
 * has info about the aPCI slot id and attention status.
 * This descriptor must be freed using kfree when done.
 static union apci_descriptor *ibm_slot_from_id(int id)
@@ -173,9 +173,9 @@ ibm_slot_done:
 * @slot: the hotplug_slot to work with
 * @status: what to set the LED to (0 or 1)
- * Description: This method is registered with the acpiphp module as a
- * callback to do the device specific task of setting the LED status.
+ * Description: this method is registered with the acpiphp module as a
+ * callback to do the device specific task of setting the LED status
 static int ibm_set_attention_status(struct hotplug_slot *slot, u8 status)
	union acpi_object args[2];
@@ -213,13 +213,13 @@ static int ibm_set_attention_status(struct hotplug_slot *slot, u8 status)
 * @slot: the hotplug_slot to work with
 * @status: returns what the LED is set to (0 or 1)
- * Description: This method is registered with the acpiphp module as a
- * callback to do the device specific task of getting the LED status.
+ * Description: this method is registered with the acpiphp module as a
+ * callback to do the device specific task of getting the LED status
 * Because there is no direct method of getting the LED status directly
 * from an ACPI call, we read the aPCI table and parse out our
 * slot descriptor to read the status from that.
 static int ibm_get_attention_status(struct hotplug_slot *slot, u8 *status)
	union apci_descriptor *ibm_slot;
@@ -245,8 +245,8 @@ static int ibm_get_attention_status(struct hotplug_slot *slot, u8 *status)
 * @event: the event info (device specific)
 * @context: passed context (our notification struct)
- * Description: This method is registered as a callback with the ACPI
- * subsystem it is called when this device has an event to notify the OS of.
+ * Description: this method is registered as a callback with the ACPI
+ * subsystem it is called when this device has an event to notify the OS of
 * The events actually come from the device as two events that get
 * synthesized into one event with data by this function.  The event
@@ -256,7 +256,7 @@ static int ibm_get_attention_status(struct hotplug_slot *slot, u8 *status)
 * From section 5.6.2.2 of the ACPI 2.0 spec, I understand that the OSPM will
 * only re-enable the interrupt that causes this event AFTER this method
 * has returned, thereby enforcing serial access for the notification struct.
 static void ibm_handle_events(acpi_handle handle, u32 event, void *context)
	u8 detail = event & 0x0f;
@@ -279,16 +279,16 @@ static void ibm_handle_events(acpi_handle handle, u32 event, void *context)
 * ibm_get_table_from_acpi - reads the APLS buffer from ACPI
 * @bufp: address to pointer to allocate for the table
- * Description: This method reads the APLS buffer in from ACPI and
+ * Description: this method reads the APLS buffer in from ACPI and
 * stores the "stripped" table into a single buffer
- * it allocates and passes the address back in bufp.
+ * it allocates and passes the address back in bufp
 * If NULL is passed in as buffer, this method only calculates
 * the size of the table and returns that without filling
- * Returns < 0 on error or the size of the table on success.
+ * returns < 0 on error or the size of the table on success
 static int ibm_get_table_from_acpi(char **bufp)
	union acpi_object *package;
@@ -349,18 +349,17 @@ read_table_done:
 * ibm_read_apci_table - callback for the sysfs apci_table file
 * @kobj: the kobject this binary attribute is a part of
- * @bin_attr: struct bin_attribute for this file
 * @buffer: the kernel space buffer to fill
 * @pos: the offset into the file
 * @size: the number of bytes requested
- * Description: Gets registered with sysfs as the reader callback
- * to be executed when /sys/bus/pci/slots/apci_table gets read.
+ * Description: gets registered with sysfs as the reader callback
+ * to be executed when /sys/bus/pci/slots/apci_table gets read
 * Since we don't get notified on open and close for this file,
 * things get really tricky here...
- * our solution is to only allow reading the table in all at once.
+ * our solution is to only allow reading the table in all at once
 static ssize_t ibm_read_apci_table(struct kobject *kobj,
				   struct bin_attribute *bin_attr,
				   char *buffer, loff_t pos, size_t size)
@@ -386,10 +385,10 @@ static ssize_t ibm_read_apci_table(struct kobject *kobj,
 * @context: a pointer to our handle to fill when we find the device
 * @rv: a return value to fill if desired
- * Description: Used as a callback when calling acpi_walk_namespace
+ * Description: used as a callback when calling acpi_walk_namespace
 * to find our device.  When this method returns non-zero
- * acpi_walk_namespace quits its search and returns our value.
+ * acpi_walk_namespace quits its search and returns our value
 static acpi_status __init ibm_find_acpi_device(acpi_handle handle,
		u32 lvl, void *context, void **rv)
diff --git a/drivers/pci/hotplug/cpqphp_core.c b/drivers/pci/hotplug/cpqphp_core.c
index 7417887..a96b739 100644
--- a/drivers/pci/hotplug/cpqphp_core.c
+++ b/drivers/pci/hotplug/cpqphp_core.c
@@ -117,10 +117,12 @@ static inline int is_slot66mhz(struct slot *slot)
 * detect_SMBIOS_pointer - find the System Management BIOS Table in mem region.
 * @begin: begin pointer for region to be scanned.
 * @end: end pointer for region to be scanned.
- * Returns pointer to the head of the SMBIOS tables (or %NULL).
+ * Returns pointer to the head of the SMBIOS tables (or NULL)
 static void __iomem * detect_SMBIOS_pointer(void __iomem *begin, void __iomem *end)
@@ -155,9 +157,9 @@ static void __iomem * detect_SMBIOS_pointer(void __iomem *begin, void __iomem *e
 * init_SERR - Initializes the per slot SERR generation.
- * @ctrl: controller to use
 * For unexpected switch opens
 static int init_SERR(struct controller * ctrl)
@@ -222,15 +224,14 @@ static int pci_print_IRQ_route (void)
 * get_subsequent_smbios_entry: get the next entry from bios table.
- * @smbios_start: where to start in the SMBIOS table
- * @smbios_table: location of the SMBIOS table
- * @curr: %NULL or pointer to previously returned structure
- * Gets the first entry if previous == NULL;
- * otherwise, returns the next entry.
- * Uses global SMBIOS Table pointer.
+ * Gets the first entry if previous == NULL
+ * Otherwise, returns the next entry
+ * Uses global SMBIOS Table pointer
- * Returns a pointer to an SMBIOS structure or NULL if none found.
+ * @curr: %NULL or pointer to previously returned structure
+ * returns a pointer to an SMBIOS structure or NULL if none found
 static void __iomem *get_subsequent_smbios_entry(void __iomem *smbios_start,
						void __iomem *smbios_table,
@@ -271,18 +272,17 @@ static void __iomem *get_subsequent_smbios_entry(void __iomem *smbios_start,
- * get_SMBIOS_entry - return the requested SMBIOS entry or %NULL
- * @smbios_start: where to start in the SMBIOS table
- * @smbios_table: location of the SMBIOS table
- * @type: SMBIOS structure type to be returned
+ * get_SMBIOS_entry
+ * @type:SMBIOS structure type to be returned
 * @previous: %NULL or pointer to previously returned structure
- * Gets the first entry of the specified type if previous == %NULL;
+ * Gets the first entry of the specified type if previous == NULL
 * Otherwise, returns the next entry of the given type.
- * Uses global SMBIOS Table pointer.
- * Uses get_subsequent_smbios_entry.
+ * Uses global SMBIOS Table pointer
+ * Uses get_subsequent_smbios_entry
- * Returns a pointer to an SMBIOS structure or %NULL if none found.
+ * returns a pointer to an SMBIOS structure or %NULL if none found
 static void __iomem *get_SMBIOS_entry(void __iomem *smbios_start,
					void __iomem *smbios_table,
@@ -581,9 +581,7 @@ get_slot_mapping(struct pci_bus *bus, u8 bus_num, u8 dev_num, u8 *slot)
 * cpqhp_set_attention_status - Turns the Amber LED for a slot on or off
- * @ctrl: struct controller to use
- * @func: PCI device/function info
- * @status: LED control flag: 1 = LED on, 0 = LED off
 cpqhp_set_attention_status(struct controller *ctrl, struct pci_func *func,
@@ -623,8 +621,7 @@ cpqhp_set_attention_status(struct controller *ctrl, struct pci_func *func,
 * set_attention_status - Turns the Amber LED for a slot on or off
- * @hotplug_slot: slot to change LED on
- * @status: LED control flag
 static int set_attention_status (struct hotplug_slot *hotplug_slot, u8 status)
diff --git a/drivers/pci/hotplug/cpqphp_ctrl.c b/drivers/pci/hotplug/cpqphp_ctrl.c
index 4018420..856d57b 100644
--- a/drivers/pci/hotplug/cpqphp_ctrl.c
+++ b/drivers/pci/hotplug/cpqphp_ctrl.c
@@ -123,7 +123,7 @@ static u8 handle_switch_change(u8 change, struct controller * ctrl)
- * cpqhp_find_slot - find the struct slot of given device
+ * cpqhp_find_slot: find the struct slot of given device
* @ctrl: scan lots of this controller
* @device: the device id to find
@@ -305,8 +305,9 @@ static u8 handle_power_fault(u8 change, struct controller * ctrl)
- * sort_by_size - sort nodes on the list by their length, smallest first.
+ * sort_by_size: sort nodes on the list by their length, smallest first.
* @head: list to sort
static int sort_by_size(struct pci_resource **head)
@@ -353,8 +354,9 @@ static int sort_by_size(struct pci_resource **head)
- * sort_by_max_size - sort nodes on the list by their length, largest first.
+ * sort_by_max_size: sort nodes on the list by their length, largest first.
* @head: list to sort
static int sort_by_max_size(struct pci_resource **head)
@@ -401,10 +403,8 @@ static int sort_by_max_size(struct pci_resource **head)
- * do_pre_bridge_resource_split - find node of resources that are unused
- * @head: new list head
- * @orig_head: original list head
- * @alignment: max node size (?)
+ * do_pre_bridge_resource_split: find node of resources that are unused
static struct pci_resource *do_pre_bridge_resource_split(struct pci_resource **head,
struct pci_resource **orig_head, u32 alignment)
@@ -477,9 +477,8 @@ static struct pci_resource *do_pre_bridge_resource_split(struct pci_resource **h
- * do_bridge_resource_split - find one node of resources that aren't in use
- * @head: list head
- * @alignment: max node size (?)
+ * do_bridge_resource_split: find one node of resources that aren't in use
static struct pci_resource *do_bridge_resource_split(struct pci_resource **head, u32 alignment)
@@ -526,13 +525,14 @@ error:
- * get_io_resource - find first node of given size not in ISA aliasing window.
+ * get_io_resource: find first node of given size not in ISA aliasing window.
* @head: list to search
* @size: size of node to find, must be a power of two.
- * Description: This function sorts the resource list by size and then returns
+ * Description: this function sorts the resource list by size and then returns
* returns the first node of "size" length that is not in the ISA aliasing
* window. If it finds a node larger than "size" it will split it up.
static struct pci_resource *get_io_resource(struct pci_resource **head, u32 size)
@@ -620,7 +620,7 @@ static struct pci_resource *get_io_resource(struct pci_resource **head, u32 size
- * get_max_resource - get largest node which has at least the given size.
+ * get_max_resource: get largest node which has at least the given size.
* @head: the list to search the node in
* @size: the minimum size of the node to find
@@ -712,7 +712,7 @@ static struct pci_resource *get_max_resource(struct pci_resource **head, u32 siz
- * get_resource - find resource of given size and split up larger ones.
+ * get_resource: find resource of given size and split up larger ones.
* @head: the list to search for resources
* @size: the size limit to use
@@ -804,14 +804,14 @@ static struct pci_resource *get_resource(struct pci_resource **head, u32 size)
- * cpqhp_resource_sort_and_combine - sort nodes by base addresses and clean up
+ * cpqhp_resource_sort_and_combine: sort nodes by base addresses and clean up.
* @head: the list to sort and clean up
* Description: Sorts all of the nodes in the list in ascending order by
* their base addresses. Also does garbage collection by
* combining adjacent nodes.
- * Returns %0 if success.
+ * returns 0 if success
int cpqhp_resource_sort_and_combine(struct pci_resource **head)
@@ -951,9 +951,9 @@ irqreturn_t cpqhp_ctrl_intr(int IRQ, void *data)
* cpqhp_slot_create - Creates a node and adds it to the proper bus.
- * @busnumber: bus where new node is to be located
+ * @busnumber - bus where new node is to be located
- * Returns pointer to the new node or %NULL if unsuccessful.
+ * Returns pointer to the new node or NULL if unsuccessful
struct pci_func *cpqhp_slot_create(u8 busnumber)
@@ -986,7 +986,7 @@ struct pci_func *cpqhp_slot_create(u8 busnumber)
* slot_remove - Removes a node from the linked list of slots.
* @old_slot: slot to remove
- * Returns %0 if successful, !0 otherwise.
+ * Returns 0 if successful, !0 otherwise.
static int slot_remove(struct pci_func * old_slot)
@@ -1026,7 +1026,7 @@ static int slot_remove(struct pci_func * old_slot)
* bridge_slot_remove - Removes a node from the linked list of slots.
* @bridge: bridge to remove
- * Returns %0 if successful, !0 otherwise.
+ * Returns 0 if successful, !0 otherwise.
static int bridge_slot_remove(struct pci_func *bridge)
@@ -1071,7 +1071,7 @@ out:
* cpqhp_slot_find - Looks for a node by bus, and device, multiple functions accessed
* @bus: bus to find
* @device: device to find
- * @index: is %0 for first function found, %1 for the second...
+ * @index: is 0 for first function found, 1 for the second...
* Returns pointer to the node if successful, %NULL otherwise.
@@ -1115,13 +1115,16 @@ static int is_bridge(struct pci_func * func)
- * set_controller_speed - set the frequency and/or mode of a specific controller segment.
+ * set_controller_speed - set the frequency and/or mode of a specific
+ * controller segment.
* @ctrl: controller to change frequency/mode for.
* @adapter_speed: the speed of the adapter we want to match.
* @hp_slot: the slot number where the adapter is installed.
- * Returns %0 if we successfully change frequency and/or mode to match the
+ * Returns 0 if we successfully change frequency and/or mode to match the
static u8 set_controller_speed(struct controller *ctrl, u8 adapter_speed, u8 hp_slot)
@@ -1250,14 +1253,13 @@ static u8 set_controller_speed(struct controller *ctrl, u8 adapter_speed, u8 hp_
* board_replaced - Called after a board has been replaced in the system.
- * @func: PCI device/function information
- * @ctrl: hotplug controller
- * This is only used if we don't have resources for hot add.
- * Turns power on for the board.
- * Checks to see if board is the same.
- * If board is same, reconfigures it.
+ * This is only used if we don't have resources for hot add
+ * Turns power on for the board
+ * Checks to see if board is the same
+ * If board is same, reconfigures it
* If board isn't same, turns it back off.
static u32 board_replaced(struct pci_func *func, struct controller *ctrl)
@@ -1401,11 +1403,10 @@ static u32 board_replaced(struct pci_func *func, struct controller *ctrl)
* board_added - Called after a board has been added to the system.
- * @func: PCI device/function info
- * @ctrl: hotplug controller
- * Turns power on for the board.
- * Configures board.
+ * Turns power on for the board
+ * Configures board
static u32 board_added(struct pci_func *func, struct controller *ctrl)
@@ -1606,10 +1607,8 @@ static u32 board_added(struct pci_func *func, struct controller *ctrl)
- * remove_board - Turns off slot and LEDs
- * @func: PCI device/function info
- * @replace_flag: whether replacing or adding a new device
- * @ctrl: target controller
+ * remove_board - Turns off slot and LED's
static u32 remove_board(struct pci_func * func, u32 replace_flag, struct controller * ctrl)
@@ -1903,11 +1902,11 @@ static void interrupt_event_handler(struct controller *ctrl)
- * cpqhp_pushbutton_thread - handle pushbutton events
- * @slot: target slot (struct)
+ * cpqhp_pushbutton_thread
- * Scheduled procedure to handle blocking stuff for the pushbuttons.
+ * Scheduled procedure to handle blocking stuff for the pushbuttons
* Handles all pending events and exits.
void cpqhp_pushbutton_thread(unsigned long slot)
@@ -2138,10 +2137,9 @@ int cpqhp_process_SS(struct controller *ctrl, struct pci_func *func)
- * switch_leds - switch the leds, go from one site to the other.
+ * switch_leds: switch the leds, go from one site to the other.
* @ctrl: controller to use
* @num_of_slots: number of slots to use
- * @work_LED: LED control value
* @direction: 1 to start from the left side, 0 to start right.
static void switch_leds(struct controller *ctrl, const int num_of_slots,
@@ -2167,11 +2165,11 @@ static void switch_leds(struct controller *ctrl, const int num_of_slots,
- * cpqhp_hardware_test - runs hardware tests
- * @ctrl: target controller
- * @test_num: the number written to the "test" file in sysfs.
+ * hardware_test - runs hardware tests
* For hot plug ctrl folks to play with.
+ * test_num is the number written to the "test" file in sysfs
int cpqhp_hardware_test(struct controller *ctrl, int test_num)
@@ -2251,12 +2249,14 @@ int cpqhp_hardware_test(struct controller *ctrl, int test_num)
* configure_new_device - Configures the PCI header information of one board.
* @ctrl: pointer to controller structure
* @func: pointer to function structure
* @behind_bridge: 1 if this is a recursive call, 0 if not
* @resources: pointer to set of resource lists
- * Returns 0 if success.
+ * Returns 0 if success
static u32 configure_new_device(struct controller * ctrl, struct pci_func * func,
u8 behind_bridge, struct resource_lists * resources)
@@ -2346,13 +2346,15 @@ static u32 configure_new_device(struct controller * ctrl, struct pci_func * func
* configure_new_function - Configures the PCI header information of one device
* @ctrl: pointer to controller structure
* @func: pointer to function structure
* @behind_bridge: 1 if this is a recursive call, 0 if not
* @resources: pointer to set of resource lists
* Calls itself recursively for bridged devices.
- * Returns 0 if success.
+ * Returns 0 if success
static int configure_new_function(struct controller *ctrl, struct pci_func *func,
diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
index d7a293e..027f686 100644
--- a/drivers/pci/hotplug/fakephp.c
+++ b/drivers/pci/hotplug/fakephp.c
@@ -165,11 +165,11 @@ static void remove_slot(struct dummy_slot *dslot)
- * pci_rescan_slot - Rescan slot
- * @temp: Device template. Should be set: bus and devfn.
+ * Tries hard not to re-enable already existing devices
+ * also handles scanning of subfunctions
- * Tries hard not to re-enable already existing devices;
- * also handles scanning of subfunctions.
+ * @param temp Device template. Should be set: bus and devfn.
static void pci_rescan_slot(struct pci_dev *temp)
@@ -229,10 +229,10 @@ static void pci_rescan_slot(struct pci_dev *temp)
- * pci_rescan_bus - Rescan PCI bus
- * @bus: the PCI bus to rescan
+ * Rescan PCI bus.
+ * call pci_rescan_slot for each possible function of the bus
- * Call pci_rescan_slot for each possible function of the bus.
static void pci_rescan_bus(const struct pci_bus *bus)
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index f1e0966..c8cb49c 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -208,10 +208,10 @@ static void set_slot_off(struct controller *ctrl, struct slot * pslot)
* board_added - Called after a board has been added to the system.
- * @p_slot: &slot where board is added
- * Turns power on for the board.
- * Configures board.
+ * Turns power on for the board
+ * Configures board
static int board_added(struct slot *p_slot)
@@ -276,8 +276,8 @@ err_exit:
- * remove_board - Turns off slot and LEDs
- * @p_slot: slot where board is being removed
+ * remove_board - Turns off slot and LED's
static int remove_board(struct slot *p_slot)
@@ -319,11 +319,11 @@ struct power_work_info {
- * pciehp_power_thread - handle pushbutton events
- * @work: &struct work_struct describing work to be done
+ * pciehp_pushbutton_thread
- * Scheduled procedure to handle blocking stuff for the pushbuttons.
+ * Scheduled procedure to handle blocking stuff for the pushbuttons
* Handles all pending events and exits.
static void pciehp_power_thread(struct work_struct *work)
diff --git a/drivers/pci/hotplug/rpadlpar_core.c b/drivers/pci/hotplug/rpadlpar_core.c
index b169b0e..deb6b5e 100644
--- a/drivers/pci/hotplug/rpadlpar_core.c
+++ b/drivers/pci/hotplug/rpadlpar_core.c
@@ -100,7 +100,6 @@ static struct device_node *find_dlpar_node(char *drc_name, int *node_type)
* find_php_slot - return hotplug slot structure for device node
- * @dn: target &device_node
* This routine will return the hotplug slot structure
* for a given device node. Note that built-in PCI slots
@@ -294,8 +293,9 @@ static int dlpar_add_vio_slot(char *drc_name, struct device_node *dn)
* dlpar_add_slot - DLPAR add an I/O Slot
* @drc_name: drc-name of newly added slot
- * Make the hotplug module and the kernel aware of a newly added I/O Slot.
+ * Make the hotplug module and the kernel aware
+ * of a newly added I/O Slot.
* -ENODEV Not a valid drc_name
* -EINVAL Slot already added
@@ -339,9 +339,9 @@ exit:
* dlpar_remove_vio_slot - DLPAR remove a virtual I/O Slot
* @drc_name: drc-name of newly added slot
- * @dn: &device_node
- * Remove the kernel and hotplug representations of an I/O Slot.
+ * Remove the kernel and hotplug representations
+ * of an I/O Slot.
* -EINVAL Vio dev doesn't exist
@@ -359,11 +359,11 @@ static int dlpar_remove_vio_slot(char *drc_name, struct device_node *dn)
- * dlpar_remove_pci_slot - DLPAR remove a PCI I/O Slot
+ * dlpar_remove_slot - DLPAR remove a PCI I/O Slot
* @drc_name: drc-name of newly added slot
- * @dn: &device_node
- * Remove the kernel and hotplug representations of a PCI I/O Slot.
+ * Remove the kernel and hotplug representations
+ * of a PCI I/O Slot.
* -ENODEV Not a valid drc_name
@@ -405,7 +405,8 @@ int dlpar_remove_pci_slot(char *drc_name, struct device_node *dn)
* dlpar_remove_slot - DLPAR remove an I/O Slot
* @drc_name: drc-name of newly added slot
- * Remove the kernel and hotplug representations of an I/O Slot.
+ * Remove the kernel and hotplug representations
+ * of an I/O Slot.
* -ENODEV Not a valid drc_name
diff --git a/drivers/pci/hotplug/rpaphp_core.c b/drivers/pci/hotplug/rpaphp_core.c
index 58f1a99..458c08e 100644
--- a/drivers/pci/hotplug/rpaphp_core.c
+++ b/drivers/pci/hotplug/rpaphp_core.c
@@ -54,12 +54,10 @@ module_param(debug, bool, 0644);
* set_attention_status - set attention LED
- * @hotplug_slot: target &hotplug_slot
- * @value: LED control value
* echo 0 > attention -- set LED OFF
* echo 1 > attention -- set LED ON
* echo 2 > attention -- set LED ID(identify, light is blinking)
static int set_attention_status(struct hotplug_slot *hotplug_slot, u8 value)
@@ -101,8 +99,6 @@ static int get_power_status(struct hotplug_slot *hotplug_slot, u8 * value)
* get_attention_status - get attention LED status
- * @hotplug_slot: slot to get status
- * @value: pointer to store status
static int get_attention_status(struct hotplug_slot *hotplug_slot, u8 * value)
@@ -258,11 +254,6 @@ static int is_php_type(char *drc_type)
* is_php_dn() - return 1 if this is a hotpluggable pci slot, else 0
- * @dn: target &device_node
- * @indexes: passed to get_children_props()
- * @names: passed to get_children_props()
- * @types: returned from get_children_props()
- * @power_domains:
* This routine will return true only if the device node is
* a hotpluggable slot. This routine will return false
@@ -288,7 +279,7 @@ static int is_php_dn(struct device_node *dn, const int **indexes,
* rpaphp_add_slot -- declare a hotplug slot to the hotplug subsystem.
- * @dn: device node of slot
+ * @dn device node of slot
* This subroutine will register a hotplugable slot with the
* PCI hotplug infrastructure. This routine is typicaly called
@@ -300,7 +291,7 @@ static int is_php_dn(struct device_node *dn, const int **indexes,
* routine will just return without doing anything, since embedded
* slots cannot be hotplugged.
- * To remove a slot, it suffices to call rpaphp_deregister_slot().
+ * To remove a slot, it suffices to call rpaphp_deregister_slot()
int rpaphp_add_slot(struct device_node *dn)
diff --git a/drivers/pci/hotplug/rpaphp_pci.c b/drivers/pci/hotplug/rpaphp_pci.c
index 0de8453..54ca865 100644
--- a/drivers/pci/hotplug/rpaphp_pci.c
+++ b/drivers/pci/hotplug/rpaphp_pci.c
@@ -79,7 +79,6 @@ static void set_slot_name(struct slot *slot)
* rpaphp_enable_slot - record slot state, config pci device
- * @slot: target &slot
* Initialize values in the slot, and the hotplug_slot info
* structures to indicate if there is a pci card plugged into
diff --git a/drivers/pci/hotplug/shpchp_ctrl.c b/drivers/pci/hotplug/shpchp_ctrl.c
index eb5cac6..d2fc355 100644
--- a/drivers/pci/hotplug/shpchp_ctrl.c
+++ b/drivers/pci/hotplug/shpchp_ctrl.c
@@ -231,10 +231,10 @@ static int fix_bus_speed(struct controller *ctrl, struct slot *pslot,
* board_added - Called after a board has been added to the system.
- * @p_slot: target &slot
- * Turns power on for the board.
- * Configures board.
+ * Turns power on for the board
+ * Configures board
static int board_added(struct slot *p_slot)
@@ -350,8 +350,8 @@ err_exit:
- * remove_board - Turns off slot and LEDs
- * @p_slot: target &slot
+ * remove_board - Turns off slot and LED's
static int remove_board(struct slot *p_slot)
@@ -397,11 +397,11 @@ struct pushbutton_work_info {
- * shpchp_pushbutton_thread - handle pushbutton events
- * @work: &struct work_struct to be handled
+ * shpchp_pushbutton_thread
- * Scheduled procedure to handle blocking stuff for the pushbuttons.
+ * Scheduled procedure to handle blocking stuff for the pushbuttons
* Handles all pending events and exits.
static void shpchp_pushbutton_thread(struct work_struct *work)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7d18773..1b7b281 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -702,10 +702,8 @@ static int __init pci_sysfs_init(void)
sysfs_initialized = 1;
for_each_pci_dev(pdev) {
retval = pci_create_sysfs_dev_files(pdev);
- pci_dev_put(pdev);
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 3c0d8d1..92a8469 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -168,11 +168,11 @@ static int find_device_iter(struct device *device, void *data)
* find_source_device - search through device hierarchy for source device
- * @parent: pointer to Root Port pci_dev data structure
+ * @p_dev: pointer to Root Port pci_dev data structure
* @id: device ID of agent who sends an error message to this Root Port
* Invoked when error is detected at the Root Port.
static struct device* find_source_device(struct pci_dev *parent, u16 id)
struct pci_dev *dev = parent;
@@ -286,15 +286,14 @@ static void report_resume(struct pci_dev *dev, void *data)
* broadcast_error_message - handle message broadcast to downstream drivers
- * @dev: pointer to from where in a hierarchy message is broadcasted down
+ * @device: pointer to from where in a hierarchy message is broadcasted down
+ * @api: callback to be broadcasted
* @state: error state
- * @error_mesg: message to print
- * @cb: callback to be broadcasted
* Invoked during error recovery process. Once being invoked, the content
* of error severity will be broadcasted to all downstream drivers in a
* hierarchy in question.
static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
enum pci_channel_state state,
@@ -429,7 +428,7 @@ static pci_ers_result_t reset_link(struct pcie_device *aerdev,
* Invoked when an error is nonfatal/fatal. Once being invoked, broadcast
* error detected message to all downstream drivers within a hierarchy in
* question and return the returned code.
static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
struct pci_dev *dev,
@@ -489,7 +488,7 @@ static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
* @info: comprehensive error information
* Invoked when an error being detected by Root Port.
static void handle_error_source(struct pcie_device * aerdev,
struct pci_dev *dev,
struct aer_err_info info)
@@ -522,7 +521,7 @@ static void handle_error_source(struct pcie_device * aerdev,
* @rpc: pointer to a Root Port data structure
* Invoked when PCIE bus loads AER service driver.
void aer_enable_rootport(struct aer_rpc *rpc)
struct pci_dev *pdev = rpc->rpd->port;
@@ -570,7 +569,7 @@ void aer_enable_rootport(struct aer_rpc *rpc)
* @rpc: pointer to a Root Port data structure
* Invoked when PCIE bus unloads AER service driver.
static void disable_root_aer(struct aer_rpc *rpc)
struct pci_dev *pdev = rpc->rpd->port;
@@ -591,7 +590,7 @@ static void disable_root_aer(struct aer_rpc *rpc)
* @rpc: pointer to the root port which holds an error
* Invoked by DPC handler to consume an error.
static struct aer_err_source* get_e_source(struct aer_rpc *rpc)
struct aer_err_source *e_source;
@@ -656,7 +655,7 @@ static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
* aer_isr_one_error - consume an error detected by root port
* @p_device: pointer to error root port service device
* @e_src: pointer to an error source
static void aer_isr_one_error(struct pcie_device *p_device,
struct aer_err_source *e_src)
@@ -707,7 +706,7 @@ static void aer_isr_one_error(struct pcie_device *p_device,
* @work: definition of this work item
* Invoked, as DPC, when root port records new detected error
void aer_isr(struct work_struct *work)
struct aer_rpc *rpc = container_of(work, struct aer_rpc, dpc_handler);
@@ -730,7 +729,7 @@ void aer_isr(struct work_struct *work)
* @rpc: pointer to a root port device being deleted
* Invoked when AER service unloaded on a specific Root Port
void aer_delete_rootport(struct aer_rpc *rpc)
/* Disable root port AER itself */
@@ -744,7 +743,7 @@ void aer_delete_rootport(struct aer_rpc *rpc)
* @dev: pointer to AER pcie device
* Invoked when AER service driver is loaded.
int aer_init(struct pcie_device *dev)
if (aer_osc_setup(dev) && !forceload)
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 26057f9..df38364 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -217,7 +217,7 @@ static int slot_reset_iter(struct device *device, void *data)
static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
- pci_ers_result_t status = PCI_ERS_RESULT_NONE;
+ pci_ers_result_t status;
/* If fatal, restore cfg space for possible link reset at upstream */
diff --git a/drivers/pnp/pnpacpi/rsparser.c b/drivers/pnp/pnpacpi/rsparser.c
index 3c5eb37..11adab1 100644
--- a/drivers/pnp/pnpacpi/rsparser.c
+++ b/drivers/pnp/pnpacpi/rsparser.c
@@ -83,11 +83,9 @@ static void pnpacpi_parse_allocated_irqresource(struct pnp_resource_table *res,
while (!(res->irq_resource[i].flags & IORESOURCE_UNSET) &&
- if (i >= PNP_MAX_IRQ) {
- printk(KERN_ERR "pnpacpi: exceeded the max number of IRQ "
- "resources: %d \n", PNP_MAX_IRQ);
+ if (i >= PNP_MAX_IRQ)
* in IO-APIC mode, use overrided attribute. Two reasons:
* 1. BIOS bug in DSDT
@@ -183,9 +181,6 @@ static void pnpacpi_parse_allocated_dmaresource(struct pnp_resource_table *res,
res->dma_resource[i].start = dma;
res->dma_resource[i].end = dma;
- printk(KERN_ERR "pnpacpi: exceeded the max number of DMA "
- "resources: %d \n", PNP_MAX_DMA);
@@ -207,9 +202,6 @@ static void pnpacpi_parse_allocated_ioresource(struct pnp_resource_table *res,
res->port_resource[i].start = io;
res->port_resource[i].end = io + len - 1;
- printk(KERN_ERR "pnpacpi: exceeded the max number of IO "
- "resources: %d \n", PNP_MAX_PORT);
@@ -233,9 +225,6 @@ static void pnpacpi_parse_allocated_memresource(struct pnp_resource_table *res,
res->mem_resource[i].start = mem;
res->mem_resource[i].end = mem + len - 1;
- printk(KERN_ERR "pnpacpi: exceeded the max number of mem "
- "resources: %d\n", PNP_MAX_MEM);
diff --git a/drivers/pnp/resource.c b/drivers/pnp/resource.c
index e50ebcf..41d73a5 100644
--- a/drivers/pnp/resource.c
+++ b/drivers/pnp/resource.c
@@ -367,10 +367,8 @@ int pnp_check_irq(struct pnp_dev *dev, int idx)
struct pci_dev *pci = NULL;
for_each_pci_dev(pci) {
- if (pci->irq == *irq) {
- pci_dev_put(pci);
+ if (pci->irq == *irq)
diff --git a/drivers/ps3/Makefile b/drivers/ps3/Makefile
index 1f5a2d3..746031d 100644
--- a/drivers/ps3/Makefile
+++ b/drivers/ps3/Makefile
-obj-$(CONFIG_PS3_VUART) += ps3-vuart.o
+obj-$(CONFIG_PS3_VUART) += vuart.o
obj-$(CONFIG_PS3_PS3AV) += ps3av_mod.o
ps3av_mod-objs += ps3av.o ps3av_cmd.o
obj-$(CONFIG_PPC_PS3) += sys-manager-core.o
-obj-$(CONFIG_PS3_SYS_MANAGER) += ps3-sys-manager.o
+obj-$(CONFIG_PS3_SYS_MANAGER) += sys-manager.o
obj-$(CONFIG_PS3_STORAGE) += ps3stor_lib.o
diff --git a/drivers/ps3/ps3-sys-manager.c b/drivers/ps3/sys-manager.c
similarity index 100%
rename from drivers/ps3/ps3-sys-manager.c
rename to drivers/ps3/sys-manager.c
diff --git a/drivers/ps3/ps3-vuart.c b/drivers/ps3/vuart.c
similarity index 100%
rename from drivers/ps3/ps3-vuart.c
rename to drivers/ps3/vuart.c
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index a4f56e9..de0da54 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -293,7 +293,7 @@ int rtc_irq_register(struct rtc_device *rtc, struct rtc_task *task)
/* Cannot register while the char dev is in use */
- if (test_and_set_bit(RTC_DEV_BUSY, &rtc->flags))
+ if (!(mutex_trylock(&rtc->char_lock)))
spin_lock_irq(&rtc->irq_task_lock);
@@ -303,7 +303,7 @@ int rtc_irq_register(struct rtc_device *rtc, struct rtc_task *task)
spin_unlock_irq(&rtc->irq_task_lock);
- clear_bit(RTC_DEV_BUSY, &rtc->flags);
+ mutex_unlock(&rtc->char_lock);
diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c
index ae1bf17..814583b 100644
--- a/drivers/rtc/rtc-dev.c
+++ b/drivers/rtc/rtc-dev.c
@@ -26,7 +26,10 @@ static int rtc_dev_open(struct inode *inode, struct file *file)
struct rtc_device, char_dev);
const struct rtc_class_ops *ops = rtc->ops;
- if (test_and_set_bit(RTC_DEV_BUSY, &rtc->flags))
+ /* We keep the lock as long as the device is in use
+ * and return immediately if busy
+ if (!(mutex_trylock(&rtc->char_lock)))
file->private_data = rtc;
@@ -40,8 +43,8 @@ static int rtc_dev_open(struct inode *inode, struct file *file)
- /* something has gone wrong */
- clear_bit(RTC_DEV_BUSY, &rtc->flags);
+ /* something has gone wrong, release the lock */
+ mutex_unlock(&rtc->char_lock);
@@ -402,7 +405,7 @@ static int rtc_dev_release(struct inode *inode, struct file *file)
if (rtc->ops->release)
rtc->ops->release(rtc->dev.parent);
- clear_bit(RTC_DEV_BUSY, &rtc->flags);
+ mutex_unlock(&rtc->char_lock);
@@ -437,6 +440,7 @@ void rtc_dev_prepare(struct rtc_device *rtc)
rtc->dev.devt = MKDEV(MAJOR(rtc_devt), rtc->id);
+ mutex_init(&rtc->char_lock);
#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
INIT_WORK(&rtc->uie_task, rtc_uie_task);
setup_timer(&rtc->uie_timer, rtc_uie_timer, (unsigned long)rtc);
diff --git a/drivers/scsi/ide-scsi.c b/drivers/scsi/ide-scsi.c
index 7a835a3..8d0244c 100644
--- a/drivers/scsi/ide-scsi.c
+++ b/drivers/scsi/ide-scsi.c
@@ -242,6 +242,16 @@ static void idescsi_output_buffers (ide_drive_t *drive, idescsi_pc_t *pc, unsign
+static void hexdump(u8 *x, int len)
+{
+	int i;
+
+	printk("[ ");
+	for (i = 0; i < len; i++)
+		printk("%x ", x[i]);
+	printk("]\n");
+}
static int idescsi_check_condition(ide_drive_t *drive, struct request *failed_command)
idescsi_scsi_t *scsi = drive_to_idescsi(drive);
@@ -272,8 +282,7 @@ static int idescsi_check_condition(ide_drive_t *drive, struct request *failed_co
pc->scsi_cmd = ((idescsi_pc_t *) failed_command->special)->scsi_cmd;
if (test_bit(IDESCSI_LOG_CMD, &scsi->log)) {
printk ("ide-scsi: %s: queue cmd = ", drive->name);
- print_hex_dump(KERN_CONT, "", DUMP_PREFIX_NONE, 16, 1, pc->c,
+ hexdump(pc->c, 6);
rq->rq_disk = scsi->disk;
return ide_do_drive_cmd(drive, rq, ide_preempt);
@@ -328,8 +337,7 @@ static int idescsi_end_request (ide_drive_t *drive, int uptodate, int nrsecs)
idescsi_pc_t *opc = (idescsi_pc_t *) rq->buffer;
printk ("ide-scsi: %s: wrap up check %lu, rst = ", drive->name, opc->scsi_cmd->serial_number);
- print_hex_dump(KERN_CONT, "", DUMP_PREFIX_NONE, 16, 1,
- pc->buffer, 16, 0);
+ hexdump(pc->buffer,16);
memcpy((void *) opc->scsi_cmd->sense_buffer, pc->buffer, SCSI_SENSE_BUFFERSIZE);
@@ -808,12 +816,10 @@ static int idescsi_queue (struct scsi_cmnd *cmd,
if (test_bit(IDESCSI_LOG_CMD, &scsi->log)) {
printk ("ide-scsi: %s: que %lu, cmd = ", drive->name, cmd->serial_number);
- print_hex_dump(KERN_CONT, "", DUMP_PREFIX_NONE, 16, 1,
- cmd->cmnd, cmd->cmd_len, 0);
+ hexdump(cmd->cmnd, cmd->cmd_len);
if (memcmp(pc->c, cmd->cmnd, cmd->cmd_len)) {
printk ("ide-scsi: %s: que %lu, tsl = ", drive->name, cmd->serial_number);
- print_hex_dump(KERN_CONT, "", DUMP_PREFIX_NONE, 16, 1,
+ hexdump(pc->c, 12);
diff --git a/drivers/scsi/zorro7xx.c b/drivers/scsi/zorro7xx.c
index 64d40a2..ac67394 100644
--- a/drivers/scsi/zorro7xx.c
+++ b/drivers/scsi/zorro7xx.c
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/zorro.h>
-#include <asm/amigahw.h>
#include <asm/amigaints.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_transport_spi.h>
diff --git a/drivers/serial/ip22zilog.c b/drivers/serial/ip22zilog.c
index 9c95bc0..f3257f7 100644
--- a/drivers/serial/ip22zilog.c
+++ b/drivers/serial/ip22zilog.c
#include "ip22zilog.h"
+void ip22_do_break(void);
* On IP22 we need to delay after register accesses but we do not need to
@@ -79,9 +81,12 @@ struct uart_ip22zilog_port {
#define IP22ZILOG_FLAG_REGS_HELD 0x00000040
#define IP22ZILOG_FLAG_TX_STOPPED 0x00000080
#define IP22ZILOG_FLAG_TX_ACTIVE 0x00000100
-#define IP22ZILOG_FLAG_RESET_DONE 0x00000200
- unsigned int tty_break;
+ unsigned int cflag;
+ /* L1-A keyboard break state. */
unsigned char parity_mask;
unsigned char prev_status;
@@ -245,26 +250,13 @@ static void ip22zilog_maybe_update_regs(struct uart_ip22zilog_port *up,
-#define Rx_BRK 0x0100 /* BREAK event software flag. */
-#define Rx_SYS 0x0200 /* SysRq event software flag. */
-static struct tty_struct *ip22zilog_receive_chars(struct uart_ip22zilog_port *up,
- struct zilog_channel *channel)
+static void ip22zilog_receive_chars(struct uart_ip22zilog_port *up,
+ struct zilog_channel *channel)
- struct tty_struct *tty;
- unsigned char ch, flag;
- if (up->port.info != NULL &&
- up->port.info->tty != NULL)
- tty = up->port.info->tty;
+ struct tty_struct *tty = up->port.info->tty; /* XXX info==NULL? */
- ch = readb(&channel->control);
- if (!(ch & Rx_CH_AV))
+ unsigned char ch, r1, flag;
r1 = read_zsreg(channel, R1);
if (r1 & (PAR_ERR | Rx_OVR | CRC_ERR)) {
@@ -273,26 +265,43 @@ static struct tty_struct *ip22zilog_receive_chars(struct uart_ip22zilog_port *up
+ ch = readb(&channel->control);
+ /* This funny hack depends upon BRK_ABRT not interfering
+ * with the other bits we care about in R1.
+ if (ch & BRK_ABRT)
ch = readb(&channel->data);
ch &= up->parity_mask;
- /* Handle the null char got when BREAK is removed. */
- r1 |= up->tty_break;
+ if (ZS_IS_CONS(up) && (r1 & BRK_ABRT)) {
+ /* Wait for BREAK to deassert to avoid potentially
+ * confusing the PROM.
+ ch = readb(&channel->control);
+ if (!(ch & BRK_ABRT))
/* A real serial line, record the character and status. */
up->port.icount.rx++;
- if (r1 & (PAR_ERR | Rx_OVR | CRC_ERR | Rx_SYS | Rx_BRK)) {
- up->tty_break = 0;
- if (r1 & (Rx_SYS | Rx_BRK)) {
- up->port.icount.brk++;
+ if (r1 & (BRK_ABRT | PAR_ERR | Rx_OVR | CRC_ERR)) {
+ if (r1 & BRK_ABRT) {
r1 &= ~(PAR_ERR | CRC_ERR);
+ up->port.icount.brk++;
+ if (uart_handle_break(&up->port))
else if (r1 & PAR_ERR)
up->port.icount.parity++;
@@ -301,21 +310,30 @@ static struct tty_struct *ip22zilog_receive_chars(struct uart_ip22zilog_port *up
up->port.icount.overrun++;
r1 &= up->port.read_status_mask;
+ if (r1 & BRK_ABRT)
else if (r1 & PAR_ERR)
else if (r1 & CRC_ERR)
if (uart_handle_sysrq_char(&up->port, ch))
- uart_insert_char(&up->port, r1, Rx_OVR, ch, flag);
+ if (up->port.ignore_status_mask == 0xff ||
+ (r1 & up->port.ignore_status_mask) == 0)
+ tty_insert_flip_char(tty, ch, flag);
+ tty_insert_flip_char(tty, 0, TTY_OVERRUN);
+ ch = readb(&channel->control);
+ if (!(ch & Rx_CH_AV))
+ tty_flip_buffer_push(tty);
static void ip22zilog_status_handle(struct uart_ip22zilog_port *up,
@@ -330,15 +348,6 @@ static void ip22zilog_status_handle(struct uart_ip22zilog_port *up,
- if (up->curregs[R15] & BRKIE) {
- if ((status & BRK_ABRT) && !(up->prev_status & BRK_ABRT)) {
- if (uart_handle_break(&up->port))
- up->tty_break = Rx_SYS;
- up->tty_break = Rx_BRK;
if (ZS_WANTS_MODEM_STATUS(up)) {
up->port.icount.dsr++;
@@ -347,10 +356,10 @@ static void ip22zilog_status_handle(struct uart_ip22zilog_port *up,
* But it does not tell us which bit has changed, we have to keep
* track of this ourselves.
- if ((status ^ up->prev_status) ^ DCD)
+ if ((status & DCD) ^ up->prev_status)
uart_handle_dcd_change(&up->port,
- if ((status ^ up->prev_status) ^ CTS)
+ if ((status & CTS) ^ up->prev_status)
uart_handle_cts_change(&up->port,
@@ -438,21 +447,19 @@ static irqreturn_t ip22zilog_interrupt(int irq, void *dev_id)
struct zilog_channel *channel
= ZILOG_CHANNEL_FROM_PORT(&up->port);
- struct tty_struct *tty;
spin_lock(&up->port.lock);
r3 = read_zsreg(channel, R3);
if (r3 & (CHAEXT | CHATxIP | CHARxIP)) {
writeb(RES_H_IUS, &channel->control);
- tty = ip22zilog_receive_chars(up, channel);
+ ip22zilog_receive_chars(up, channel);
ip22zilog_status_handle(up, channel);
@@ -460,22 +467,18 @@ static irqreturn_t ip22zilog_interrupt(int irq, void *dev_id)
spin_unlock(&up->port.lock);
- tty_flip_buffer_push(tty);
channel = ZILOG_CHANNEL_FROM_PORT(&up->port);
spin_lock(&up->port.lock);
if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
writeb(RES_H_IUS, &channel->control);
- tty = ip22zilog_receive_chars(up, channel);
+ ip22zilog_receive_chars(up, channel);
ip22zilog_status_handle(up, channel);
@@ -483,9 +486,6 @@ static irqreturn_t ip22zilog_interrupt(int irq, void *dev_id)
spin_unlock(&up->port.lock);
- tty_flip_buffer_push(tty);
20112
spin_unlock_irqrestore(&port->lock, flags);
20115
-static void __ip22zilog_reset(struct uart_ip22zilog_port *up)
20117
- struct zilog_channel *channel;
20120
- if (up->flags & IP22ZILOG_FLAG_RESET_DONE)
20123
- /* Let pending transmits finish. */
20124
- channel = ZILOG_CHANNEL_FROM_PORT(&up->port);
20125
- for (i = 0; i < 1000; i++) {
20126
- unsigned char stat = read_zsreg(channel, R1);
20127
- if (stat & ALL_SNT)
20132
- if (!ZS_IS_CHANNEL_A(up)) {
20134
- channel = ZILOG_CHANNEL_FROM_PORT(&up->port);
20136
- write_zsreg(channel, R9, FHWRES);
20138
- (void) read_zsreg(channel, R0);
20140
- up->flags |= IP22ZILOG_FLAG_RESET_DONE;
20141
- up->next->flags |= IP22ZILOG_FLAG_RESET_DONE;
20144
static void __ip22zilog_startup(struct uart_ip22zilog_port *up)
20146
struct zilog_channel *channel;
20148
channel = ZILOG_CHANNEL_FROM_PORT(&up->port);
20150
- __ip22zilog_reset(up);
20152
- __load_zsregs(channel, up->curregs);
20153
- /* set master interrupt enable */
20154
- write_zsreg(channel, R9, up->curregs[R9]);
20155
up->prev_status = readb(&channel->control);
20157
/* Enable receiver and transmitter. */
20158
@@ -894,6 +859,8 @@ ip22zilog_set_termios(struct uart_port *port, struct ktermios *termios,
20160
up->flags &= ~IP22ZILOG_FLAG_MODEM_STATUS;
20162
+ up->cflag = termios->c_cflag;
20164
ip22zilog_maybe_update_regs(up, ZILOG_CHANNEL_FROM_PORT(port));
20165
uart_update_timeout(port, termios->c_cflag, baud);
20167
@@ -1025,29 +992,74 @@ ip22zilog_console_write(struct console *con, const char *s, unsigned int count)
20168
spin_unlock_irqrestore(&up->port.lock, flags);
20172
+ip22serial_console_termios(struct console *con, char *options)
20174
+ int baud = 9600, bits = 8, cflag;
20175
+ int parity = 'n';
20179
+ uart_parse_options(options, &baud, &parity, &bits, &flow);
20181
+ cflag = CREAD | HUPCL | CLOCAL;
20184
+ case 150: cflag |= B150; break;
20185
+ case 300: cflag |= B300; break;
20186
+ case 600: cflag |= B600; break;
20187
+ case 1200: cflag |= B1200; break;
20188
+ case 2400: cflag |= B2400; break;
20189
+ case 4800: cflag |= B4800; break;
20190
+ case 9600: cflag |= B9600; break;
20191
+ case 19200: cflag |= B19200; break;
20192
+ case 38400: cflag |= B38400; break;
20193
+ default: baud = 9600; cflag |= B9600; break;
20196
+ con->cflag = cflag | CS8; /* 8N1 */
20198
+ uart_update_timeout(&ip22zilog_port_table[con->index].port, cflag, baud);
20201
static int __init ip22zilog_console_setup(struct console *con, char *options)
20203
struct uart_ip22zilog_port *up = &ip22zilog_port_table[con->index];
20204
unsigned long flags;
20205
- int baud = 9600, bits = 8;
20206
- int parity = 'n';
20210
- up->flags |= IP22ZILOG_FLAG_IS_CONS;
20211
+ printk("Console: ttyS%d (IP22-Zilog)\n", con->index);
20213
- printk(KERN_INFO "Console: ttyS%d (IP22-Zilog)\n", con->index);
20214
+ /* Get firmware console settings. */
20215
+ ip22serial_console_termios(con, options);
20217
+ /* Firmware console speed is limited to 150-->38400 baud so
20218
+ * this hackish cflag thing is OK.
20220
+ switch (con->cflag & CBAUD) {
20221
+ case B150: baud = 150; break;
20222
+ case B300: baud = 300; break;
20223
+ case B600: baud = 600; break;
20224
+ case B1200: baud = 1200; break;
20225
+ case B2400: baud = 2400; break;
20226
+ case B4800: baud = 4800; break;
20227
+ default: case B9600: baud = 9600; break;
20228
+ case B19200: baud = 19200; break;
20229
+ case B38400: baud = 38400; break;
20232
+ brg = BPS_TO_BRG(baud, ZS_CLOCK / ZS_CLOCK_DIVISOR);
20234
spin_lock_irqsave(&up->port.lock, flags);
20236
- up->curregs[R15] |= BRKIE;
20237
+ up->curregs[R15] = BRKIE;
20238
+ ip22zilog_convert_to_zs(up, con->cflag, 0, brg);
20240
__ip22zilog_startup(up);
20242
spin_unlock_irqrestore(&up->port.lock, flags);
20245
- uart_parse_options(options, &baud, &parity, &bits, &flow);
20246
- return uart_set_options(&up->port, con, baud, parity, bits, flow);
20250
static struct uart_driver ip22zilog_reg;
20251
@@ -1128,10 +1140,25 @@ static void __init ip22zilog_prepare(void)
20252
up[(chip * 2) + 1].port.line = (chip * 2) + 1;
20253
up[(chip * 2) + 1].flags |= IP22ZILOG_FLAG_IS_CHANNEL_A;
20257
+static void __init ip22zilog_init_hw(void)
20261
+ for (i = 0; i < NUM_CHANNELS; i++) {
20262
+ struct uart_ip22zilog_port *up = &ip22zilog_port_table[i];
20263
+ struct zilog_channel *channel = ZILOG_CHANNEL_FROM_PORT(&up->port);
20264
+ unsigned long flags;
20267
- for (channel = 0; channel < NUM_CHANNELS; channel++) {
20268
- struct uart_ip22zilog_port *up = &ip22zilog_port_table[channel];
20270
+ spin_lock_irqsave(&up->port.lock, flags);
20272
+ if (ZS_IS_CHANNEL_A(up)) {
20273
+ write_zsreg(channel, R9, FHWRES);
20275
+ (void) read_zsreg(channel, R0);
20278
/* Normal serial TTY. */
20279
up->parity_mask = 0xff;
20280
@@ -1142,10 +1169,16 @@ static void __init ip22zilog_prepare(void)
20281
up->curregs[R9] = NV | MIE;
20282
up->curregs[R10] = NRZ;
20283
up->curregs[R11] = TCBR | RCBR;
20284
- brg = BPS_TO_BRG(9600, ZS_CLOCK / ZS_CLOCK_DIVISOR);
20286
+ brg = BPS_TO_BRG(baud, ZS_CLOCK / ZS_CLOCK_DIVISOR);
20287
up->curregs[R12] = (brg & 0xff);
20288
up->curregs[R13] = (brg >> 8) & 0xff;
20289
up->curregs[R14] = BRENAB;
20290
+ __load_zsregs(channel, up->curregs);
20291
+ /* set master interrupt enable */
20292
+ write_zsreg(channel, R9, up->curregs[R9]);
20294
+ spin_unlock_irqrestore(&up->port.lock, flags);
20298
@@ -1162,6 +1195,8 @@ static int __init ip22zilog_ports_init(void)
20299
panic("IP22-Zilog: Unable to register zs interrupt handler.\n");
20302
+ ip22zilog_init_hw();
20304
ret = uart_register_driver(&ip22zilog_reg);
diff --git a/drivers/serial/pxa.c b/drivers/serial/pxa.c
index 352fcb8..af3a011 100644
--- a/drivers/serial/pxa.c
+++ b/drivers/serial/pxa.c
@@ -585,11 +585,11 @@ serial_pxa_type(struct uart_port *port)
+#ifdef CONFIG_SERIAL_PXA_CONSOLE
static struct uart_pxa_port *serial_pxa_ports[4];
static struct uart_driver serial_pxa_reg;
-#ifdef CONFIG_SERIAL_PXA_CONSOLE
#define BOTH_EMPTY (UART_LSR_TEMT | UART_LSR_THRE)
diff --git a/drivers/spi/atmel_spi.c b/drivers/spi/atmel_spi.c
index ff6a14b..0d342dc 100644
--- a/drivers/spi/atmel_spi.c
+++ b/drivers/spi/atmel_spi.c
@@ -497,7 +497,7 @@ static int atmel_spi_setup(struct spi_device *spi)
/* chipselect must have been muxed as GPIO (e.g. in board setup) */
npcs_pin = (unsigned int)spi->controller_data;
if (!spi->controller_state) {
- ret = gpio_request(npcs_pin, spi->dev.bus_id);
+ ret = gpio_request(npcs_pin, "spi_npcs");
spi->controller_state = (void *)npcs_pin;
diff --git a/drivers/spi/spi_s3c24xx_gpio.c b/drivers/spi/spi_s3c24xx_gpio.c
index 109d82c..0fa25e2 100644
--- a/drivers/spi/spi_s3c24xx_gpio.c
+++ b/drivers/spi/spi_s3c24xx_gpio.c
@@ -96,7 +96,6 @@ static void s3c2410_spigpio_chipselect(struct spi_device *dev, int value)
static int s3c2410_spigpio_probe(struct platform_device *dev)
- struct s3c2410_spigpio_info *info;
struct spi_master *master;
struct s3c2410_spigpio *sp;
@@ -114,11 +113,10 @@ static int s3c2410_spigpio_probe(struct platform_device *dev)
platform_set_drvdata(dev, sp);
/* copy in the plkatform data */
- info = sp->info = dev->dev.platform_data;
+ sp->info = dev->dev.platform_data;
/* setup spi bitbang adaptor */
sp->bitbang.master = spi_master_get(master);
- sp->bitbang.master->bus_num = info->bus_num;
sp->bitbang.chipselect = s3c2410_spigpio_chipselect;
sp->bitbang.txrx_word[SPI_MODE_0] = s3c2410_spigpio_txrx_mode0;
@@ -126,18 +124,13 @@ static int s3c2410_spigpio_probe(struct platform_device *dev)
sp->bitbang.txrx_word[SPI_MODE_2] = s3c2410_spigpio_txrx_mode2;
sp->bitbang.txrx_word[SPI_MODE_3] = s3c2410_spigpio_txrx_mode3;
- /* set state of spi pins, always assume that the clock is
- * available, but do check the MOSI and MISO. */
- s3c2410_gpio_setpin(info->pin_clk, 0);
- s3c2410_gpio_cfgpin(info->pin_clk, S3C2410_GPIO_OUTPUT);
+ /* set state of spi pins */
+ s3c2410_gpio_setpin(sp->info->pin_clk, 0);
+ s3c2410_gpio_setpin(sp->info->pin_mosi, 0);
- if (info->pin_mosi < S3C2410_GPH10) {
- s3c2410_gpio_setpin(info->pin_mosi, 0);
- s3c2410_gpio_cfgpin(info->pin_mosi, S3C2410_GPIO_OUTPUT);
- if (info->pin_miso != S3C2410_GPA0 && info->pin_miso < S3C2410_GPH10)
- s3c2410_gpio_cfgpin(info->pin_miso, S3C2410_GPIO_INPUT);
+ s3c2410_gpio_cfgpin(sp->info->pin_clk, S3C2410_GPIO_OUTPUT);
+ s3c2410_gpio_cfgpin(sp->info->pin_mosi, S3C2410_GPIO_OUTPUT);
+ s3c2410_gpio_cfgpin(sp->info->pin_miso, S3C2410_GPIO_INPUT);
ret = spi_bitbang_start(&sp->bitbang);
diff --git a/drivers/usb/README b/drivers/usb/README
index 284f46b..3c84341 100644
--- a/drivers/usb/README
+++ b/drivers/usb/README
@@ -39,12 +39,12 @@ first subdirectory in the list below that it fits into.
image/ - This is for still image drivers, like scanners or
-../input/ - This is for any driver that uses the input subsystem,
+input/ - This is for any driver that uses the input subsystem,
like keyboard, mice, touchscreens, tablets, etc.
-../media/ - This is for multimedia drivers, like video cameras,
+media/ - This is for multimedia drivers, like video cameras,
radios, and any other drivers that talk to the v4l
-../net/ - This is for network drivers.
+net/ - This is for network drivers.
serial/ - This is for USB to serial drivers.
storage/ - This is for USB mass-storage drivers.
class/ - This is for all USB device drivers that do not fit
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index c51f8e9..8586817 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -585,6 +585,9 @@ static int usb_uevent(struct device *dev, struct kobj_uevent_env *env)
struct usb_device *usb_dev;
/* driver is often null here; dev_dbg() would oops */
pr_debug ("usb %s: uevent\n", dev->bus_id);
@@ -628,6 +631,14 @@ static int usb_uevent(struct device *dev, struct kobj_uevent_env *env)
usb_dev->descriptor.bDeviceProtocol))
+ if (add_uevent_var(env, "BUSNUM=%03d",
+ usb_dev->bus->busnum))
+ if (add_uevent_var(env, "DEVNUM=%03d",
+ usb_dev->devnum))
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index d5ed3fa..fea8256 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1311,8 +1311,8 @@ void usb_hcd_flush_endpoint(struct usb_device *udev,
hcd = bus_to_hcd(udev->bus);
/* No more submits can occur */
- spin_lock_irq(&hcd_urb_list_lock);
+ spin_lock_irq(&hcd_urb_list_lock);
list_for_each_entry (urb, &ep->urb_list, urb_list) {
@@ -1345,7 +1345,6 @@ rescan:
/* list contents may have changed */
- spin_lock(&hcd_urb_list_lock);
spin_unlock_irq(&hcd_urb_list_lock);
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 13b326a..036c3de 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -335,7 +335,7 @@ static void kick_khubd(struct usb_hub *hub)
to_usb_interface(hub->intfdev)->pm_usage_cnt = 1;
spin_lock_irqsave(&hub_event_lock, flags);
- if (!hub->disconnected && list_empty(&hub->event_list)) {
+ if (list_empty(&hub->event_list)) {
list_add_tail(&hub->event_list, &hub_event_list);
wake_up(&khubd_wait);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index fcd40ec..316a746 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1172,6 +1172,7 @@ int usb_set_interface(struct usb_device *dev, int interface, int alternate)
struct usb_host_interface *alt;
if (dev->state == USB_STATE_SUSPENDED)
return -EHOSTUNREACH;
@@ -1211,7 +1212,8 @@ int usb_set_interface(struct usb_device *dev, int interface, int alternate)
/* prevent submissions using previous endpoint settings */
- if (iface->cur_altsetting != alt && device_is_registered(&iface->dev))
+ changed = (iface->cur_altsetting != alt);
+ if (changed && device_is_registered(&iface->dev))
usb_remove_sysfs_intf_files(iface);
usb_disable_interface(dev, iface);
@@ -1248,7 +1250,7 @@ int usb_set_interface(struct usb_device *dev, int interface, int alternate)
* (Likewise, EP0 never "halts" on well designed devices.)
usb_enable_interface(dev, iface);
- if (device_is_registered(&iface->dev))
+ if (changed && device_is_registered(&iface->dev))
usb_create_sysfs_intf_files(iface);
@@ -1346,10 +1348,34 @@ static int usb_if_uevent(struct device *dev, struct kobj_uevent_env *env)
struct usb_interface *intf;
struct usb_host_interface *alt;
+ /* driver is often null here; dev_dbg() would oops */
+ pr_debug ("usb %s: uevent\n", dev->bus_id);
intf = to_usb_interface(dev);
usb_dev = interface_to_usbdev(intf);
alt = intf->cur_altsetting;
+#ifdef CONFIG_USB_DEVICEFS
+ if (add_uevent_var(env, "DEVICE=/proc/bus/usb/%03d/%03d",
+ usb_dev->bus->busnum, usb_dev->devnum))
+ if (add_uevent_var(env, "PRODUCT=%x/%x/%x",
+ le16_to_cpu(usb_dev->descriptor.idVendor),
+ le16_to_cpu(usb_dev->descriptor.idProduct),
+ le16_to_cpu(usb_dev->descriptor.bcdDevice)))
+ if (add_uevent_var(env, "TYPE=%d/%d/%d",
+ usb_dev->descriptor.bDeviceClass,
+ usb_dev->descriptor.bDeviceSubClass,
+ usb_dev->descriptor.bDeviceProtocol))
if (add_uevent_var(env, "INTERFACE=%d/%d/%d",
alt->desc.bInterfaceClass,
alt->desc.bInterfaceSubClass,
@@ -1615,6 +1641,12 @@ free_interfaces:
intf->dev.bus_id, ret);
+ /* The driver's probe method can call usb_set_interface(),
+ * which would mean the interface's sysfs files are already
+ * created. Just in case, we'll remove them first.
+ usb_remove_sysfs_intf_files(intf);
usb_create_sysfs_intf_files(intf);
diff --git a/drivers/usb/core/sysfs.c b/drivers/usb/core/sysfs.c
index 32bd130..b04afd0 100644
--- a/drivers/usb/core/sysfs.c
+++ b/drivers/usb/core/sysfs.c
@@ -735,8 +735,6 @@ int usb_create_sysfs_intf_files(struct usb_interface *intf)
struct usb_host_interface *alt = intf->cur_altsetting;
- if (intf->sysfs_files_created)
retval = sysfs_create_group(&dev->kobj, &intf_attr_grp);
@@ -748,7 +746,6 @@ int usb_create_sysfs_intf_files(struct usb_interface *intf)
if (intf->intf_assoc)
retval = sysfs_create_group(&dev->kobj, &intf_assoc_attr_grp);
usb_create_intf_ep_files(intf, udev);
- intf->sysfs_files_created = 1;
@@ -756,11 +753,8 @@ void usb_remove_sysfs_intf_files(struct usb_interface *intf)
struct device *dev = &intf->dev;
- if (!intf->sysfs_files_created)
usb_remove_intf_ep_files(intf);
device_remove_file(dev, &dev_attr_interface);
sysfs_remove_group(&dev->kobj, &intf_attr_grp);
sysfs_remove_group(&intf->dev.kobj, &intf_assoc_attr_grp);
- intf->sysfs_files_created = 0;
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index 8f14237..c4a6f10 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -192,34 +192,9 @@ static void usb_release_dev(struct device *dev)
-#ifdef CONFIG_HOTPLUG
-static int usb_dev_uevent(struct device *dev, struct kobj_uevent_env *env)
- struct usb_device *usb_dev;
- usb_dev = to_usb_device(dev);
- if (add_uevent_var(env, "BUSNUM=%03d", usb_dev->bus->busnum))
- if (add_uevent_var(env, "DEVNUM=%03d", usb_dev->devnum))
-static int usb_dev_uevent(struct device *dev, struct kobj_uevent_env *env)
-#endif /* CONFIG_HOTPLUG */
struct device_type usb_device_type = {
.name = "usb_device",
.release = usb_release_dev,
- .uevent = usb_dev_uevent,
diff --git a/drivers/usb/gadget/omap_udc.c b/drivers/usb/gadget/omap_udc.c
index d377154..87c4f50 100644
--- a/drivers/usb/gadget/omap_udc.c
+++ b/drivers/usb/gadget/omap_udc.c
@@ -1241,14 +1241,14 @@ static void pullup_enable(struct omap_udc *udc)
udc->gadget.dev.parent->power.power_state = PMSG_ON;
udc->gadget.dev.power.power_state = PMSG_ON;
UDC_SYSCON1_REG |= UDC_PULLUP_EN;
- if (!gadget_is_otg(&udc->gadget) && !cpu_is_omap15xx())
+ if (!gadget_is_otg(udc->gadget) && !cpu_is_omap15xx())
OTG_CTRL_REG |= OTG_BSESSVLD;
UDC_IRQ_EN_REG = UDC_DS_CHG_IE;
static void pullup_disable(struct omap_udc *udc)
- if (!gadget_is_otg(&udc->gadget) && !cpu_is_omap15xx())
+ if (!gadget_is_otg(udc->gadget) && !cpu_is_omap15xx())
OTG_CTRL_REG &= ~OTG_BSESSVLD;
UDC_IRQ_EN_REG = UDC_DS_CHG_IE;
UDC_SYSCON1_REG &= ~UDC_PULLUP_EN;
@@ -1386,7 +1386,7 @@ static void update_otg(struct omap_udc *udc)
- if (!gadget_is_otg(&udc->gadget))
+ if (!gadget_is_otg(udc->gadget))
if (OTG_CTRL_REG & OTG_ID)
diff --git a/drivers/usb/gadget/s3c2410_udc.c b/drivers/usb/gadget/s3c2410_udc.c
index 4ce050c..e3e90f8 100644
--- a/drivers/usb/gadget/s3c2410_udc.c
+++ b/drivers/usb/gadget/s3c2410_udc.c
@@ -52,10 +52,10 @@
#include <asm/arch/irqs.h>
#include <asm/arch/hardware.h>
+#include <asm/arch/regs-clock.h>
#include <asm/arch/regs-gpio.h>
-#include <asm/plat-s3c24xx/regs-udc.h>
-#include <asm/plat-s3c24xx/udc.h>
+#include <asm/arch/regs-udc.h>
+#include <asm/arch/udc.h>
#include <asm/mach-types.h>
@@ -1511,11 +1511,7 @@ static irqreturn_t s3c2410_udc_vbus_irq(int irq, void *_dev)
unsigned int value;
dprintk(DEBUG_NORMAL, "%s()\n", __func__);
- /* some cpus cannot read from an line configured to IRQ! */
- s3c2410_gpio_cfgpin(udc_info->vbus_pin, S3C2410_GPIO_INPUT);
value = s3c2410_gpio_getpin(udc_info->vbus_pin);
- s3c2410_gpio_cfgpin(udc_info->vbus_pin, S3C2410_GPIO_SFN2);
if (udc_info->vbus_pin_inverted)
@@ -1876,9 +1872,9 @@ static int s3c2410_udc_probe(struct platform_device *pdev)
if (udc_info && udc_info->vbus_pin > 0) {
irq = s3c2410_gpio_getirq(udc_info->vbus_pin);
retval = request_irq(irq, s3c2410_udc_vbus_irq,
- IRQF_DISABLED | IRQF_TRIGGER_RISING
- | IRQF_TRIGGER_FALLING | IRQF_SHARED,
- gadget_name, udc);
+ IRQF_DISABLED | IRQF_TRIGGER_RISING
+ | IRQF_TRIGGER_FALLING,
+ gadget_name, udc);
dev_err(dev, "can't get vbus irq %i, err %d\n",
diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
index 49a91c5..177e78e 100644
--- a/drivers/usb/host/Kconfig
+++ b/drivers/usb/host/Kconfig
@@ -156,7 +156,7 @@ config USB_OHCI_HCD_PCI
config USB_OHCI_HCD_SSB
bool "OHCI support for Broadcom SSB OHCI core"
- depends on USB_OHCI_HCD && (SSB = y || SSB = USB_OHCI_HCD) && EXPERIMENTAL
+ depends on USB_OHCI_HCD && (SSB = y || SSB = CONFIG_USB_OHCI_HCD) && EXPERIMENTAL
Support for the Sonics Silicon Backplane (SSB) attached
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 5f2d74e..c151444 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -575,15 +575,12 @@ static int ehci_run (struct usb_hcd *hcd)
* from the companions to the EHCI controller. If any of the
* companions are in the middle of a port reset at the time, it
* could cause trouble. Write-locking ehci_cf_port_reset_rwsem
- * guarantees that no resets are in progress. After we set CF,
- * a short delay lets the hardware catch up; new resets shouldn't
- * be started before the port switching actions could complete.
+ * guarantees that no resets are in progress.
down_write(&ehci_cf_port_reset_rwsem);
hcd->state = HC_STATE_RUNNING;
ehci_writel(ehci, FLAG_CF, &ehci->regs->configured_flag);
ehci_readl(ehci, &ehci->regs->command); /* unblock posted writes */
up_write(&ehci_cf_port_reset_rwsem);
temp = HC_VERSION(ehci_readl(ehci, &ehci->caps->hc_capbase));
diff --git a/drivers/usb/image/microtek.c b/drivers/usb/image/microtek.c
index bc207e3..91e999c 100644
--- a/drivers/usb/image/microtek.c
+++ b/drivers/usb/image/microtek.c
@@ -819,7 +819,7 @@ static int mts_usb_probe(struct usb_interface *intf,
new_desc->host->hostdata[0] = (unsigned long)new_desc;
- if (scsi_add_host(new_desc->host, &dev->dev)) {
+ if (scsi_add_host(new_desc->host, NULL)) {
diff --git a/drivers/usb/misc/adutux.c b/drivers/usb/misc/adutux.c
20744
index 5a2c44e..c567aa7 100644
20745
--- a/drivers/usb/misc/adutux.c
20746
+++ b/drivers/usb/misc/adutux.c
20747
@@ -79,22 +79,12 @@ MODULE_DEVICE_TABLE(usb, device_table);
20749
#define COMMAND_TIMEOUT (2*HZ) /* 60 second timeout for a command */
20752
- * The locking scheme is a vanilla 3-lock:
20753
- * adu_device.buflock: A spinlock, covers what IRQs touch.
20754
- * adutux_mutex: A Static lock to cover open_count. It would also cover
20755
- * any globals, but we don't have them in 2.6.
20756
- * adu_device.mtx: A mutex to hold across sleepers like copy_from_user.
20757
- * It covers all of adu_device, except the open_count
20758
- * and what .buflock covers.
20761
/* Structure to hold all of our device specific stuff */
20762
struct adu_device {
20763
- struct mutex mtx;
20764
+ struct mutex mtx; /* locks this structure */
20765
struct usb_device* udev; /* save off the usb device pointer */
20766
struct usb_interface* interface;
20767
- unsigned int minor; /* the starting minor number for this device */
20768
+ unsigned char minor; /* the starting minor number for this device */
20769
char serial_number[8];
20771
int open_count; /* number of times this port has been opened */
20772
@@ -117,11 +107,8 @@ struct adu_device {
20773
char* interrupt_out_buffer;
20774
struct usb_endpoint_descriptor* interrupt_out_endpoint;
20775
struct urb* interrupt_out_urb;
20776
- int out_urb_finished;
20779
-static DEFINE_MUTEX(adutux_mutex);
20781
static struct usb_driver adu_driver;
20783
static void adu_debug_data(int level, const char *function, int size,
20784
@@ -145,31 +132,27 @@ static void adu_debug_data(int level, const char *function, int size,
20786
static void adu_abort_transfers(struct adu_device *dev)
20788
- unsigned long flags;
20790
dbg(2," %s : enter", __FUNCTION__);
20792
+ if (dev == NULL) {
20793
+ dbg(1," %s : dev is null", __FUNCTION__);
20797
if (dev->udev == NULL) {
20798
dbg(1," %s : udev is null", __FUNCTION__);
20802
- /* shutdown transfer */
20803
+ dbg(2," %s : udev state %d", __FUNCTION__, dev->udev->state);
20804
+ if (dev->udev->state == USB_STATE_NOTATTACHED) {
20805
+ dbg(1," %s : udev is not attached", __FUNCTION__);
20809
- /* XXX Anchor these instead */
20810
- spin_lock_irqsave(&dev->buflock, flags);
20811
- if (!dev->read_urb_finished) {
20812
- spin_unlock_irqrestore(&dev->buflock, flags);
20813
- usb_kill_urb(dev->interrupt_in_urb);
20815
- spin_unlock_irqrestore(&dev->buflock, flags);
20817
- spin_lock_irqsave(&dev->buflock, flags);
20818
- if (!dev->out_urb_finished) {
20819
- spin_unlock_irqrestore(&dev->buflock, flags);
20820
- usb_kill_urb(dev->interrupt_out_urb);
20822
- spin_unlock_irqrestore(&dev->buflock, flags);
20823
+ /* shutdown transfer */
20824
+ usb_unlink_urb(dev->interrupt_in_urb);
20825
+ usb_unlink_urb(dev->interrupt_out_urb);
20828
dbg(2," %s : leave", __FUNCTION__);
20829
@@ -179,6 +162,8 @@ static void adu_delete(struct adu_device *dev)
20831
dbg(2, "%s enter", __FUNCTION__);
20833
+	adu_abort_transfers(dev);
/* free data structures */
usb_free_urb(dev->interrupt_in_urb);
usb_free_urb(dev->interrupt_out_urb);
@@ -254,10 +239,7 @@ static void adu_interrupt_out_callback(struct urb *urb)
-		spin_lock(&dev->buflock);
-		dev->out_urb_finished = 1;
-		wake_up(&dev->write_wait);
-		spin_unlock(&dev->buflock);
+		wake_up_interruptible(&dev->write_wait);
adu_debug_data(5, __FUNCTION__, urb->actual_length,
@@ -270,17 +252,12 @@ static int adu_open(struct inode *inode, struct file *file)
struct adu_device *dev = NULL;
struct usb_interface *interface;
dbg(2,"%s : enter", __FUNCTION__);
subminor = iminor(inode);
-	if ((retval = mutex_lock_interruptible(&adutux_mutex))) {
-		dbg(2, "%s : mutex lock failed", __FUNCTION__);
-		goto exit_no_lock;
interface = usb_find_interface(&adu_driver, subminor);
err("%s - error, can't find device for minor %d",
@@ -290,54 +267,54 @@ static int adu_open(struct inode *inode, struct file *file)
dev = usb_get_intfdata(interface);
-	if (!dev || !dev->udev) {
goto exit_no_device;
-	/* check that nobody else is using the device */
-	if (dev->open_count) {
+	/* lock this device */
+	if ((retval = mutex_lock_interruptible(&dev->mtx))) {
+		dbg(2, "%s : mutex lock failed", __FUNCTION__);
goto exit_no_device;
+	/* increment our usage count for the device */
dbg(2,"%s : open count %d", __FUNCTION__, dev->open_count);
/* save device in the file's private structure */
file->private_data = dev;
-	/* initialize in direction */
-	dev->read_buffer_length = 0;
-	/* fixup first read by having urb waiting for it */
-	usb_fill_int_urb(dev->interrupt_in_urb,dev->udev,
-			 usb_rcvintpipe(dev->udev,
-					dev->interrupt_in_endpoint->bEndpointAddress),
-			 dev->interrupt_in_buffer,
-			 le16_to_cpu(dev->interrupt_in_endpoint->wMaxPacketSize),
-			 adu_interrupt_in_callback, dev,
-			 dev->interrupt_in_endpoint->bInterval);
-	dev->read_urb_finished = 0;
-	if (usb_submit_urb(dev->interrupt_in_urb, GFP_KERNEL))
-		dev->read_urb_finished = 1;
-	/* we ignore failure */
-	/* end of fixup for first read */
+	if (dev->open_count == 1) {
+		/* initialize in direction */
+		dev->read_buffer_length = 0;
-	/* initialize out direction */
-	dev->out_urb_finished = 1;
+		/* fixup first read by having urb waiting for it */
+		usb_fill_int_urb(dev->interrupt_in_urb,dev->udev,
+				 usb_rcvintpipe(dev->udev,
+						dev->interrupt_in_endpoint->bEndpointAddress),
+				 dev->interrupt_in_buffer,
+				 le16_to_cpu(dev->interrupt_in_endpoint->wMaxPacketSize),
+				 adu_interrupt_in_callback, dev,
+				 dev->interrupt_in_endpoint->bInterval);
+		/* dev->interrupt_in_urb->transfer_flags |= URB_ASYNC_UNLINK; */
+		dev->read_urb_finished = 0;
+		retval = usb_submit_urb(dev->interrupt_in_urb, GFP_KERNEL);
+		--dev->open_count;
+	mutex_unlock(&dev->mtx);
-	mutex_unlock(&adutux_mutex);
dbg(2,"%s : leave, return value %d ", __FUNCTION__, retval);
-static void adu_release_internal(struct adu_device *dev)
+static int adu_release_internal(struct adu_device *dev)
dbg(2," %s : enter", __FUNCTION__);
/* decrement our usage count for the device */
@@ -349,11 +326,12 @@ static void adu_release_internal(struct adu_device *dev)
dbg(2," %s : leave", __FUNCTION__);
static int adu_release(struct inode *inode, struct file *file)
-	struct adu_device *dev;
+	struct adu_device *dev = NULL;
dbg(2," %s : enter", __FUNCTION__);
@@ -365,13 +343,15 @@ static int adu_release(struct inode *inode, struct file *file)
dev = file->private_data;
dbg(1," %s : object is NULL", __FUNCTION__);
-	mutex_lock(&adutux_mutex); /* not interruptible */
+	/* lock our device */
+	mutex_lock(&dev->mtx); /* not interruptible */
if (dev->open_count <= 0) {
dbg(1," %s : device not opened", __FUNCTION__);
@@ -379,15 +359,19 @@ static int adu_release(struct inode *inode, struct file *file)
-	adu_release_internal(dev);
if (dev->udev == NULL) {
/* the device was unplugged before the file was released */
-		if (!dev->open_count)	/* ... and we're the last user */
+		mutex_unlock(&dev->mtx);
+	/* do the work */
+	retval = adu_release_internal(dev);
-	mutex_unlock(&adutux_mutex);
+	mutex_unlock(&dev->mtx);
dbg(2," %s : leave, return value %d", __FUNCTION__, retval);
@@ -409,12 +393,12 @@ static ssize_t adu_read(struct file *file, __user char *buffer, size_t count,
dev = file->private_data;
dbg(2," %s : dev=%p", __FUNCTION__, dev);
+	/* lock this object */
if (mutex_lock_interruptible(&dev->mtx))
return -ERESTARTSYS;
/* verify that the device wasn't unplugged */
-	if (dev->udev == NULL) {
+	if (dev->udev == NULL || dev->minor == 0) {
err("No device or device unplugged %d", retval);
@@ -468,7 +452,7 @@ static ssize_t adu_read(struct file *file, __user char *buffer, size_t count,
/* even the primary was empty - we may need to do IO */
-			if (!dev->read_urb_finished) {
+			if (dev->interrupt_in_urb->status == -EINPROGRESS) {
/* somebody is doing IO */
spin_unlock_irqrestore(&dev->buflock, flags);
dbg(2," %s : submitted already", __FUNCTION__);
@@ -476,7 +460,6 @@ static ssize_t adu_read(struct file *file, __user char *buffer, size_t count,
/* we must initiate input */
dbg(2," %s : initiate input", __FUNCTION__);
dev->read_urb_finished = 0;
-				spin_unlock_irqrestore(&dev->buflock, flags);
usb_fill_int_urb(dev->interrupt_in_urb,dev->udev,
usb_rcvintpipe(dev->udev,
@@ -486,12 +469,15 @@ static ssize_t adu_read(struct file *file, __user char *buffer, size_t count,
adu_interrupt_in_callback,
dev->interrupt_in_endpoint->bInterval);
-				retval = usb_submit_urb(dev->interrupt_in_urb, GFP_KERNEL);
-					dev->read_urb_finished = 1;
+				retval = usb_submit_urb(dev->interrupt_in_urb, GFP_ATOMIC);
+				spin_unlock_irqrestore(&dev->buflock, flags);
+					dbg(2," %s : submitted OK", __FUNCTION__);
if (retval == -ENOMEM) {
retval = bytes_read ? bytes_read : -ENOMEM;
+					spin_unlock_irqrestore(&dev->buflock, flags);
dbg(2," %s : submit failed", __FUNCTION__);
@@ -500,14 +486,10 @@ static ssize_t adu_read(struct file *file, __user char *buffer, size_t count,
/* we wait for I/O to complete */
set_current_state(TASK_INTERRUPTIBLE);
add_wait_queue(&dev->read_wait, &wait);
-			spin_lock_irqsave(&dev->buflock, flags);
-			if (!dev->read_urb_finished) {
-				spin_unlock_irqrestore(&dev->buflock, flags);
+			if (!dev->read_urb_finished)
timeout = schedule_timeout(COMMAND_TIMEOUT);
-				spin_unlock_irqrestore(&dev->buflock, flags);
set_current_state(TASK_RUNNING);
remove_wait_queue(&dev->read_wait, &wait);
if (timeout <= 0) {
@@ -527,23 +509,19 @@ static ssize_t adu_read(struct file *file, __user char *buffer, size_t count,
retval = bytes_read;
/* if the primary buffer is empty then use it */
-	spin_lock_irqsave(&dev->buflock, flags);
-	if (should_submit && dev->read_urb_finished) {
-		dev->read_urb_finished = 0;
-		spin_unlock_irqrestore(&dev->buflock, flags);
+	if (should_submit && !dev->interrupt_in_urb->status==-EINPROGRESS) {
usb_fill_int_urb(dev->interrupt_in_urb,dev->udev,
usb_rcvintpipe(dev->udev,
dev->interrupt_in_endpoint->bEndpointAddress),
-				dev->interrupt_in_buffer,
-				le16_to_cpu(dev->interrupt_in_endpoint->wMaxPacketSize),
-				adu_interrupt_in_callback,
-				dev->interrupt_in_endpoint->bInterval);
-		if (usb_submit_urb(dev->interrupt_in_urb, GFP_KERNEL) != 0)
-			dev->read_urb_finished = 1;
+			dev->interrupt_in_buffer,
+			le16_to_cpu(dev->interrupt_in_endpoint->wMaxPacketSize),
+			adu_interrupt_in_callback,
+			dev->interrupt_in_endpoint->bInterval);
+		/* dev->interrupt_in_urb->transfer_flags |= URB_ASYNC_UNLINK; */
+		dev->read_urb_finished = 0;
+		usb_submit_urb(dev->interrupt_in_urb, GFP_KERNEL);
/* we ignore failure */
-		spin_unlock_irqrestore(&dev->buflock, flags);
@@ -557,24 +535,24 @@ exit:
static ssize_t adu_write(struct file *file, const __user char *buffer,
size_t count, loff_t *ppos)
-	DECLARE_WAITQUEUE(waita, current);
struct adu_device *dev;
size_t bytes_written = 0;
size_t bytes_to_write;
size_t buffer_size;
-	unsigned long flags;
dbg(2," %s : enter, count = %Zd", __FUNCTION__, count);
dev = file->private_data;
+	/* lock this object */
retval = mutex_lock_interruptible(&dev->mtx);
/* verify that the device wasn't unplugged */
-	if (dev->udev == NULL) {
+	if (dev->udev == NULL || dev->minor == 0) {
err("No device or device unplugged %d", retval);
@@ -586,37 +564,42 @@ static ssize_t adu_write(struct file *file, const __user char *buffer,
while (count > 0) {
-		add_wait_queue(&dev->write_wait, &waita);
-		set_current_state(TASK_INTERRUPTIBLE);
-		spin_lock_irqsave(&dev->buflock, flags);
-		if (!dev->out_urb_finished) {
-			spin_unlock_irqrestore(&dev->buflock, flags);
+		if (dev->interrupt_out_urb->status == -EINPROGRESS) {
+			timeout = COMMAND_TIMEOUT;
-			mutex_unlock(&dev->mtx);
-			if (signal_pending(current)) {
+			while (timeout > 0) {
+				if (signal_pending(current)) {
dbg(1," %s : interrupted", __FUNCTION__);
-				set_current_state(TASK_RUNNING);
-				goto exit_onqueue;
-			if (schedule_timeout(COMMAND_TIMEOUT) == 0) {
-				dbg(1, "%s - command timed out.", __FUNCTION__);
-				retval = -ETIMEDOUT;
-				goto exit_onqueue;
-			remove_wait_queue(&dev->write_wait, &waita);
+				mutex_unlock(&dev->mtx);
+				timeout = interruptible_sleep_on_timeout(&dev->write_wait, timeout);
retval = mutex_lock_interruptible(&dev->mtx);
retval = bytes_written ? bytes_written : retval;
+				if (timeout > 0) {
+				dbg(1," %s : interrupted timeout: %d", __FUNCTION__, timeout);
+			dbg(1," %s : final timeout: %d", __FUNCTION__, timeout);
+			if (timeout == 0) {
+				dbg(1, "%s - command timed out.", __FUNCTION__);
+				retval = -ETIMEDOUT;
+			dbg(4," %s : in progress, count = %Zd", __FUNCTION__, count);
-			dbg(4," %s : in progress, count = %Zd", __FUNCTION__, count);
-			spin_unlock_irqrestore(&dev->buflock, flags);
-			set_current_state(TASK_RUNNING);
-			remove_wait_queue(&dev->write_wait, &waita);
dbg(4," %s : sending, count = %Zd", __FUNCTION__, count);
/* write the data into interrupt_out_buffer from userspace */
@@ -639,12 +622,11 @@ static ssize_t adu_write(struct file *file, const __user char *buffer,
adu_interrupt_out_callback,
-				dev->interrupt_out_endpoint->bInterval);
+				dev->interrupt_in_endpoint->bInterval);
+			/* dev->interrupt_in_urb->transfer_flags |= URB_ASYNC_UNLINK; */
dev->interrupt_out_urb->actual_length = bytes_to_write;
-			dev->out_urb_finished = 0;
retval = usb_submit_urb(dev->interrupt_out_urb, GFP_KERNEL);
-				dev->out_urb_finished = 1;
err("Couldn't submit interrupt_out_urb %d", retval);
@@ -655,17 +637,16 @@ static ssize_t adu_write(struct file *file, const __user char *buffer,
bytes_written += bytes_to_write;
-	mutex_unlock(&dev->mtx);
-	return bytes_written;
+	retval = bytes_written;
+	/* unlock the device */
mutex_unlock(&dev->mtx);
dbg(2," %s : leave, return value %d", __FUNCTION__, retval);
-	remove_wait_queue(&dev->write_wait, &waita);
@@ -850,22 +831,25 @@ static void adu_disconnect(struct usb_interface *interface)
dbg(2," %s : enter", __FUNCTION__);
dev = usb_get_intfdata(interface);
+	usb_set_intfdata(interface, NULL);
-	mutex_lock(&dev->mtx);	/* not interruptible */
-	dev->udev = NULL;	/* poison */
minor = dev->minor;
+	/* give back our minor */
usb_deregister_dev(interface, &adu_class);
-	mutex_unlock(&dev->mtx);
-	mutex_lock(&adutux_mutex);
-	usb_set_intfdata(interface, NULL);
+	mutex_lock(&dev->mtx); /* not interruptible */
/* if the device is not opened, then we clean up right now */
dbg(2," %s : open count %d", __FUNCTION__, dev->open_count);
-	if (!dev->open_count)
+	if (!dev->open_count) {
+		mutex_unlock(&dev->mtx);
-	mutex_unlock(&adutux_mutex);
+		dev->udev = NULL;
+		mutex_unlock(&dev->mtx);
dev_info(&interface->dev, "ADU device adutux%d now disconnected\n",
(minor - ADU_MINOR_BASE));
diff --git a/drivers/usb/misc/usbled.c b/drivers/usb/misc/usbled.c
index 06cb719..49c5c5c 100644
--- a/drivers/usb/misc/usbled.c
+++ b/drivers/usb/misc/usbled.c
@@ -144,14 +144,12 @@ static void led_disconnect(struct usb_interface *interface)
struct usb_led *dev;
dev = usb_get_intfdata (interface);
+	usb_set_intfdata (interface, NULL);
device_remove_file(&interface->dev, &dev_attr_blue);
device_remove_file(&interface->dev, &dev_attr_red);
device_remove_file(&interface->dev, &dev_attr_green);
-	/* first remove the files, then set the pointer to NULL */
-	usb_set_intfdata (interface, NULL);
usb_put_dev(dev->udev);
diff --git a/drivers/usb/serial/generic.c b/drivers/usb/serial/generic.c
index d415311..9eb4a65 100644
--- a/drivers/usb/serial/generic.c
+++ b/drivers/usb/serial/generic.c
@@ -327,7 +327,6 @@ void usb_serial_generic_read_bulk_callback (struct urb *urb)
struct usb_serial_port *port = (struct usb_serial_port *)urb->context;
unsigned char *data = urb->transfer_buffer;
int status = urb->status;
-	unsigned long flags;
dbg("%s - port %d", __FUNCTION__, port->number);
@@ -340,11 +339,11 @@ void usb_serial_generic_read_bulk_callback (struct urb *urb)
usb_serial_debug_data(debug, &port->dev, __FUNCTION__, urb->actual_length, data);
/* Throttle the device if requested by tty */
-	spin_lock_irqsave(&port->lock, flags);
+	spin_lock(&port->lock);
if (!(port->throttled = port->throttle_req))
/* Handle data and continue reading from device */
flush_and_resubmit_read_urb(port);
-	spin_unlock_irqrestore(&port->lock, flags);
+	spin_unlock(&port->lock);
EXPORT_SYMBOL_GPL(usb_serial_generic_read_bulk_callback);
diff --git a/drivers/usb/serial/keyspan.c b/drivers/usb/serial/keyspan.c
index feba967..1f7ab15 100644
--- a/drivers/usb/serial/keyspan.c
+++ b/drivers/usb/serial/keyspan.c
@@ -1215,14 +1215,12 @@ static int keyspan_chars_in_buffer (struct usb_serial_port *port)
static int keyspan_open (struct usb_serial_port *port, struct file *filp)
-	struct keyspan_port_private 	*p_priv;
-	struct keyspan_serial_private 	*s_priv;
-	struct usb_serial 		*serial = port->serial;
+	struct keyspan_port_private 	*p_priv;
+	struct keyspan_serial_private 	*s_priv;
+	struct usb_serial 		*serial = port->serial;
const struct keyspan_device_details	*d_details;
-	int				baud_rate, device_port;
-	unsigned int			cflag;
s_priv = usb_get_serial_data(serial);
p_priv = usb_get_serial_port_data(port);
@@ -1265,30 +1263,6 @@ static int keyspan_open (struct usb_serial_port *port, struct file *filp)
/* usb_settoggle(urb->dev, usb_pipeendpoint(urb->pipe), usb_pipeout(urb->pipe), 0); */
-	/* get the terminal config for the setup message now so we don't
-	 * need to send 2 of them */
-	cflag = port->tty->termios->c_cflag;
-	device_port = port->number - port->serial->minor;
-	/* Baud rate calculation takes baud rate as an integer
-	   so other rates can be generated if desired. */
-	baud_rate = tty_get_baud_rate(port->tty);
-	/* If no match or invalid, leave as default */
-	if (baud_rate >= 0
-	    && d_details->calculate_baud_rate(baud_rate, d_details->baudclk,
-				NULL, NULL, NULL, device_port) == KEYSPAN_BAUD_RATE_OK) {
-		p_priv->baud = baud_rate;
-	/* set CTS/RTS handshake etc. */
-	p_priv->cflag = cflag;
-	p_priv->flow_control = (cflag & CRTSCTS)? flow_cts: flow_none;
-	keyspan_send_setup(port, 1);
-	//keyspan_set_termios(port, NULL);
diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
index c29c912..a5ced7e 100644
--- a/drivers/usb/serial/mos7840.c
+++ b/drivers/usb/serial/mos7840.c
@@ -2711,7 +2711,7 @@ static int mos7840_startup(struct usb_serial *serial)
status = mos7840_set_reg_sync(serial->port[0], ZLP_REG5, Data);
dbg("Writing ZLP_REG5 failed status-0x%x\n", status);
dbg("ZLP_REG5 Writing success status%d\n", status);
diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
index cf8add9..2cd3f1d 100644
--- a/drivers/usb/serial/pl2303.c
+++ b/drivers/usb/serial/pl2303.c
@@ -86,7 +86,6 @@ static struct usb_device_id id_table [] = {
{ USB_DEVICE(ALCOR_VENDOR_ID, ALCOR_PRODUCT_ID) },
{ USB_DEVICE(HUAWEI_VENDOR_ID, HUAWEI_PRODUCT_ID) },
{ USB_DEVICE(WS002IN_VENDOR_ID, WS002IN_PRODUCT_ID) },
-	{ USB_DEVICE(COREGA_VENDOR_ID, COREGA_PRODUCT_ID) },
{ }					/* Terminating entry */
diff --git a/drivers/usb/serial/pl2303.h b/drivers/usb/serial/pl2303.h
index d31f5d2..ed603e3 100644
--- a/drivers/usb/serial/pl2303.h
+++ b/drivers/usb/serial/pl2303.h
@@ -104,6 +104,3 @@
#define WS002IN_VENDOR_ID	0x11f6
#define WS002IN_PRODUCT_ID	0x2001
-/* Corega CG-USBRS232R Serial Adapter */
-#define COREGA_VENDOR_ID	0x07aa
-#define COREGA_PRODUCT_ID	0x002a
diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index 605ebcc..833f6e1 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -136,8 +136,6 @@ static struct usb_device_id id_table_3port [] = {
{ USB_DEVICE(0x0f30, 0x1b1d) },	/* Sierra Wireless MC5720 */
{ USB_DEVICE(0x1199, 0x0218) },	/* Sierra Wireless MC5720 */
{ USB_DEVICE(0x1199, 0x0020) },	/* Sierra Wireless MC5725 */
-	{ USB_DEVICE(0x1199, 0x0220) },	/* Sierra Wireless MC5725 */
-	{ USB_DEVICE(0x1199, 0x0220) },	/* Sierra Wireless MC5725 */
{ USB_DEVICE(0x1199, 0x0019) },	/* Sierra Wireless AirCard 595 */
{ USB_DEVICE(0x1199, 0x0021) },	/* Sierra Wireless AirCard 597E */
{ USB_DEVICE(0x1199, 0x0120) },	/* Sierra Wireless USB Dongle 595U*/
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 836a34a..1ba19ea 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -177,10 +177,6 @@ static int slave_configure(struct scsi_device *sdev)
* is an occasional series of retries that will all fail. */
sdev->retry_hwerror = 1;
-		/* USB disks should allow restart.  Some drives spin down
-		 * automatically, requiring a START-STOP UNIT command. */
-		sdev->allow_restart = 1;
/* Non-disk-type devices don't need to blacklist any pages
diff --git a/drivers/usb/storage/unusual_devs.h b/drivers/usb/storage/unusual_devs.h
index 2c27721..22ab238 100644
--- a/drivers/usb/storage/unusual_devs.h
+++ b/drivers/usb/storage/unusual_devs.h
@@ -342,11 +342,11 @@ UNUSUAL_DEV(  0x04b0, 0x040d, 0x0100, 0x0100,
US_FL_FIX_CAPACITY),
/* Reported by Graber and Mike Pagano <mpagano-kernel@mpagano.com> */
-UNUSUAL_DEV(  0x04b0, 0x040f, 0x0100, 0x0200,
-		"NIKON",
-		"NIKON DSC D200",
-		US_SC_DEVICE, US_PR_DEVICE, NULL,
-		US_FL_FIX_CAPACITY),
+UNUSUAL_DEV(  0x04b0, 0x040f, 0x0200, 0x0200,
+	"NIKON",
+	"NIKON DSC D200",
+	US_SC_DEVICE, US_PR_DEVICE, NULL,
+	US_FL_FIX_CAPACITY),
/* Reported by Emil Larsson <emil@swip.net> */
UNUSUAL_DEV(  0x04b0, 0x0411, 0x0100, 0x0101,
@@ -731,13 +731,6 @@ UNUSUAL_DEV(  0x0584, 0x0008, 0x0102, 0x0102,
US_SC_SCSI, US_PR_ALAUDA, init_alauda, 0 ),
-/* Reported by RTE <raszilki@yandex.ru> */
-UNUSUAL_DEV(  0x058f, 0x6387, 0x0141, 0x0141,
-		US_SC_DEVICE, US_PR_DEVICE, NULL,
-		US_FL_MAX_SECTORS_64 ),
/* Fabrizio Fellini <fello@libero.it> */
UNUSUAL_DEV(  0x0595, 0x4343, 0x0000, 0x2210,
diff --git a/drivers/video/Kconfig b/drivers/video/Kconfig
index 5b3dbcf..7d86e9e 100644
--- a/drivers/video/Kconfig
+++ b/drivers/video/Kconfig
@@ -641,17 +641,6 @@ config FB_VESA
You will get a boot time penguin logo at no additional cost. Please
read <file:Documentation/fb/vesafb.txt>. If unsure, say Y.
-	bool "EFI-based Framebuffer Support"
-	depends on (FB = y) && X86
-	select FB_CFB_FILLRECT
-	select FB_CFB_COPYAREA
-	select FB_CFB_IMAGEBLIT
-	  This is the EFI frame buffer device driver. If the firmware on
-	  your platform is UEFI2.0, select Y to add support for
-	  Graphics Output Protocol for early console messages to appear.
bool "Intel-based Macintosh Framebuffer Support"
depends on (FB = y) && X86 && EFI
diff --git a/drivers/video/Makefile b/drivers/video/Makefile
index 83e02b3..59d6c45 100644
--- a/drivers/video/Makefile
+++ b/drivers/video/Makefile
@@ -118,7 +118,6 @@ obj-$(CONFIG_FB_OMAP)             += omap/
obj-$(CONFIG_FB_UVESA)            += uvesafb.o
obj-$(CONFIG_FB_VESA)             += vesafb.o
obj-$(CONFIG_FB_IMAC)             += imacfb.o
-obj-$(CONFIG_FB_EFI)              += efifb.o
obj-$(CONFIG_FB_VGA16)            += vga16fb.o
obj-$(CONFIG_FB_OF)               += offb.o
obj-$(CONFIG_FB_BF54X_LQ043)	  += bf54x-lq043fb.o
diff --git a/drivers/video/atmel_lcdfb.c b/drivers/video/atmel_lcdfb.c
index 11a3a22..235b618 100644
--- a/drivers/video/atmel_lcdfb.c
+++ b/drivers/video/atmel_lcdfb.c
@@ -268,10 +268,6 @@ static int atmel_lcdfb_set_par(struct fb_info *info)
/* Turn off the LCD controller and the DMA controller */
lcdc_writel(sinfo, ATMEL_LCDC_PWRCON, sinfo->guard_time << ATMEL_LCDC_GUARDT_OFFSET);
-	/* Wait for the LCDC core to become idle */
-	while (lcdc_readl(sinfo, ATMEL_LCDC_PWRCON) & ATMEL_LCDC_BUSY)
lcdc_writel(sinfo, ATMEL_LCDC_DMACON, 0);
if (info->var.bits_per_pixel == 1)
diff --git a/drivers/video/efifb.c b/drivers/video/efifb.c
deleted file mode 100644
index bd779ae..0000000
--- a/drivers/video/efifb.c
- * Framebuffer driver for EFI/UEFI based system
- * (c) 2006 Edgar Hucek <gimli@dark-green.com>
- * Original efi driver written by Gerd Knorr <kraxel@goldbach.in-berlin.de>
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <linux/fb.h>
-#include <linux/platform_device.h>
-#include <linux/screen_info.h>
-#include <video/vga.h>
-static struct fb_var_screeninfo efifb_defined __initdata = {
-	.activate		= FB_ACTIVATE_NOW,
-	.right_margin		= 32,
-	.upper_margin		= 16,
-	.lower_margin		= 4,
-	.vmode			= FB_VMODE_NONINTERLACED,
-static struct fb_fix_screeninfo efifb_fix __initdata = {
-	.type			= FB_TYPE_PACKED_PIXELS,
-	.accel			= FB_ACCEL_NONE,
-	.visual			= FB_VISUAL_TRUECOLOR,
-static int efifb_setcolreg(unsigned regno, unsigned red, unsigned green,
-			   unsigned blue, unsigned transp,
-			   struct fb_info *info)
-	 *  Set a single color register. The values supplied are
-	 *  already rounded down to the hardware's capabilities
-	 *  (according to the entries in the `var' structure). Return
-	 *  != 0 for invalid regno.
-	if (regno >= info->cmap.len)
-	if (regno < 16) {
-		((u32 *)(info->pseudo_palette))[regno] =
-			(red   << info->var.red.offset)   |
-			(green << info->var.green.offset) |
-			(blue  << info->var.blue.offset);
-static struct fb_ops efifb_ops = {
-	.owner		= THIS_MODULE,
-	.fb_setcolreg	= efifb_setcolreg,
-	.fb_fillrect	= cfb_fillrect,
-	.fb_copyarea	= cfb_copyarea,
-	.fb_imageblit	= cfb_imageblit,
-static int __init efifb_probe(struct platform_device *dev)
-	struct fb_info *info;
-	unsigned int size_vmode;
-	unsigned int size_remap;
-	unsigned int size_total;
-	efifb_fix.smem_start = screen_info.lfb_base;
-	efifb_defined.bits_per_pixel = screen_info.lfb_depth;
-	efifb_defined.xres = screen_info.lfb_width;
-	efifb_defined.yres = screen_info.lfb_height;
-	efifb_fix.line_length = screen_info.lfb_linelength;
-	/*   size_vmode -- that is the amount of memory needed for the
-	 *                 used video mode, i.e. the minimum amount of
-	 *                 memory we need. */
-	size_vmode = efifb_defined.yres * efifb_fix.line_length;
-	/*   size_total -- all video memory we have. Used for
-	 *                 entries, ressource allocation and bounds
-	size_total = screen_info.lfb_size;
-	if (size_total < size_vmode)
-		size_total = size_vmode;
-	/*   size_remap -- the amount of video memory we are going to
-	 *                 use for efifb.  With modern cards it is no
-	 *                 option to simply use size_total as that
-	 *                 wastes plenty of kernel address space. */
-	size_remap  = size_vmode * 2;
-	if (size_remap < size_vmode)
-		size_remap = size_vmode;
-	if (size_remap > size_total)
-		size_remap = size_total;
-	efifb_fix.smem_len = size_remap;
-	if (!request_mem_region(efifb_fix.smem_start, size_total, "efifb"))
-		/* We cannot make this fatal. Sometimes this comes from magic
-		   spaces our resource handlers simply don't know about */
-		printk(KERN_WARNING
-		       "efifb: cannot reserve video memory at 0x%lx\n",
-			efifb_fix.smem_start);
-	info = framebuffer_alloc(sizeof(u32) * 16, &dev->dev);
-		goto err_release_mem;
-	info->pseudo_palette = info->par;
-	info->par = NULL;
-	info->screen_base = ioremap(efifb_fix.smem_start, efifb_fix.smem_len);
-	if (!info->screen_base) {
-		printk(KERN_ERR "efifb: abort, cannot ioremap video memory "
-				"0x%x @ 0x%lx\n",
-			efifb_fix.smem_len, efifb_fix.smem_start);
-	printk(KERN_INFO "efifb: framebuffer at 0x%lx, mapped to 0x%p, "
-	       "using %dk, total %dk\n",
-	       efifb_fix.smem_start, info->screen_base,
-	       size_remap/1024, size_total/1024);
-	printk(KERN_INFO "efifb: mode is %dx%dx%d, linelength=%d, pages=%d\n",
-	       efifb_defined.xres, efifb_defined.yres,
-	       efifb_defined.bits_per_pixel, efifb_fix.line_length,
-	       screen_info.pages);
-	efifb_defined.xres_virtual = efifb_defined.xres;
-	efifb_defined.yres_virtual = efifb_fix.smem_len /
-					efifb_fix.line_length;
-	printk(KERN_INFO "efifb: scrolling: redraw\n");
-	efifb_defined.yres_virtual = efifb_defined.yres;
-	/* some dummy values for timing to make fbset happy */
-	efifb_defined.pixclock     = 10000000 / efifb_defined.xres *
-					1000 / efifb_defined.yres;
-	efifb_defined.left_margin  = (efifb_defined.xres / 8) & 0xf8;
-	efifb_defined.hsync_len    = (efifb_defined.xres / 8) & 0xf8;
-	efifb_defined.red.offset    = screen_info.red_pos;
-	efifb_defined.red.length    = screen_info.red_size;
-	efifb_defined.green.offset  = screen_info.green_pos;
-	efifb_defined.green.length  = screen_info.green_size;
-	efifb_defined.blue.offset   = screen_info.blue_pos;
-	efifb_defined.blue.length   = screen_info.blue_size;
-	efifb_defined.transp.offset = screen_info.rsvd_pos;
-	efifb_defined.transp.length = screen_info.rsvd_size;
-	printk(KERN_INFO "efifb: %s: "
-	       "size=%d:%d:%d:%d, shift=%d:%d:%d:%d\n",
-	       screen_info.rsvd_size,
-	       screen_info.red_size,
-	       screen_info.green_size,
-	       screen_info.blue_size,
-	       screen_info.rsvd_pos,
-	       screen_info.red_pos,
-	       screen_info.green_pos,
-	       screen_info.blue_pos);
-	efifb_fix.ypanstep  = 0;
-	efifb_fix.ywrapstep = 0;
-	info->fbops = &efifb_ops;
-	info->var = efifb_defined;
-	info->fix = efifb_fix;
-	info->flags = FBINFO_FLAG_DEFAULT;
-	if (fb_alloc_cmap(&info->cmap, 256, 0) < 0) {
-	if (register_framebuffer(info) < 0) {
-		goto err_fb_dealoc;
-	printk(KERN_INFO "fb%d: %s frame buffer device\n",
-		info->node, info->fix.id);
-	fb_dealloc_cmap(&info->cmap);
-	iounmap(info->screen_base);
-	framebuffer_release(info);
-	release_mem_region(efifb_fix.smem_start, size_total);
-static struct platform_driver efifb_driver = {
-	.probe	= efifb_probe,
-static struct platform_device efifb_device = {
-static int __init efifb_init(void)
-	if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI)
-	ret = platform_driver_register(&efifb_driver);
-		ret = platform_device_register(&efifb_device);
-			platform_driver_unregister(&efifb_driver);
-module_init(efifb_init);
-MODULE_LICENSE("GPL");
diff --git a/drivers/video/fb_ddc.c b/drivers/video/fb_ddc.c
index a0df632..f836137 100644
--- a/drivers/video/fb_ddc.c
+++ b/drivers/video/fb_ddc.c
@@ -56,12 +56,13 @@ unsigned char *fb_ddc_read(struct i2c_adapter *adapter)
algo_data->setscl(algo_data->data, 1);
+	algo_data->setscl(algo_data->data, 0);
for (i = 0; i < 3; i++) {
/* For some old monitors we need the
* following process to initialize/stop DDC
-		algo_data->setsda(algo_data->data, 1);
+		algo_data->setsda(algo_data->data, 0);
algo_data->setscl(algo_data->data, 1);
@@ -96,15 +97,14 @@ unsigned char *fb_ddc_read(struct i2c_adapter *adapter)
algo_data->setsda(algo_data->data, 1);
algo_data->setscl(algo_data->data, 0);
-	algo_data->setsda(algo_data->data, 0);
/* Release the DDC lines when done or the Apple Cinema HD display
-	algo_data->setsda(algo_data->data, 1);
-	algo_data->setscl(algo_data->data, 1);
+	algo_data->setsda(algo_data->data, 0);
+	algo_data->setscl(algo_data->data, 0);
diff --git a/drivers/video/imacfb.c b/drivers/video/imacfb.c
index 9366ef2..6455fd2 100644
--- a/drivers/video/imacfb.c
+++ b/drivers/video/imacfb.c
@@ -234,6 +234,10 @@ static int __init imacfb_probe(struct platform_device *dev)
size_remap = size_total;
imacfb_fix.smem_len = size_remap;
+	screen_info.imacpm_seg = 0;
if (!request_mem_region(imacfb_fix.smem_start, size_total, "imacfb")) {
printk(KERN_WARNING
"imacfb: cannot reserve video memory at 0x%lx\n",
diff --git a/drivers/video/ps3fb.c b/drivers/video/ps3fb.c
index 9c56c49..75836aa 100644
--- a/drivers/video/ps3fb.c
+++ b/drivers/video/ps3fb.c
#define L1GPU_DISPLAY_SYNC_HSYNC		1
#define L1GPU_DISPLAY_SYNC_VSYNC		2
+#define DDR_SIZE				(0)	/* used no ddr */
#define GPU_CMD_BUF_SIZE			(64 * 1024)
#define GPU_IOIF				(0x0d000000UL)
#define GPU_ALIGN_UP(x)				_ALIGN_UP((x), 64)
@@ -1059,7 +1060,6 @@ static int __devinit ps3fb_probe(struct ps3_system_bus_device *dev)
int status, res_index;
struct task_struct *task;
-	unsigned long max_ps3fb_size;
status = ps3_open_hv_device(dev);
@@ -1085,15 +1085,8 @@ static int __devinit ps3fb_probe(struct ps3_system_bus_device *dev)
ps3fb_set_sync(&dev->core);
-	max_ps3fb_size = _ALIGN_UP(GPU_IOIF, 256*1024*1024) - GPU_IOIF;
-	if (ps3fb_videomemory.size > max_ps3fb_size) {
-		dev_info(&dev->core, "Limiting ps3fb mem size to %lu bytes\n",
-		ps3fb_videomemory.size = max_ps3fb_size;
/* get gpu context handle */
-	status = lv1_gpu_memory_allocate(ps3fb_videomemory.size, 0, 0, 0, 0,
+	status = lv1_gpu_memory_allocate(DDR_SIZE, 0, 0, 0, 0,
&ps3fb.memory_handle, &ddr_lpar);
dev_err(&dev->core, "%s: lv1_gpu_memory_allocate failed: %d\n",
diff --git a/fs/Kconfig b/fs/Kconfig
21843
index 635f3e2..429a002 100644
21846
@@ -459,15 +459,6 @@ config OCFS2_DEBUG_MASKLOG
21847
This option will enlarge your kernel, but it allows debugging of
21848
ocfs2 filesystem issues.
21850
-config OCFS2_DEBUG_FS
21851
- bool "OCFS2 expensive checks"
21852
- depends on OCFS2_FS
21855
- This option will enable expensive consistency checks. Enable
21856
- this option for debugging only as it is likely to decrease
21857
- performance of the filesystem.
21860
tristate "Minix fs support"
21862
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
21863
index e8b7c3a..bd26e4c 100644
21864
--- a/fs/compat_ioctl.c
21865
+++ b/fs/compat_ioctl.c
21866
@@ -1954,12 +1954,6 @@ ULONG_IOCTL(TIOCSCTTY)
21867
COMPATIBLE_IOCTL(TIOCGPTN)
21868
COMPATIBLE_IOCTL(TIOCSPTLCK)
21869
COMPATIBLE_IOCTL(TIOCSERGETLSR)
21871
-COMPATIBLE_IOCTL(TCGETS2)
21872
-COMPATIBLE_IOCTL(TCSETS2)
21873
-COMPATIBLE_IOCTL(TCSETSW2)
21874
-COMPATIBLE_IOCTL(TCSETSF2)
21877
COMPATIBLE_IOCTL(FIOCLEX)
21878
COMPATIBLE_IOCTL(FIONCLEX)
21879
diff --git a/fs/exec.c b/fs/exec.c
index 282240a..4ccaaa4 100644
@@ -1780,12 +1780,6 @@ int do_coredump(long signr, int exit_code, struct pt_regs * regs)
but keep the previous behaviour for now. */
if (!ispipe && !S_ISREG(inode->i_mode))
- * Dont allow local users get cute and trick others to coredump
- * into their pre-created files:
- if (inode->i_uid != current->fsuid)
if (!file->f_op->write)
diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index c87ae29..7730388 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -178,10 +178,3 @@ extern const struct inode_operations ext2_special_inode_operations;
extern const struct inode_operations ext2_fast_symlink_inode_operations;
extern const struct inode_operations ext2_symlink_inode_operations;
-static inline ext2_fsblk_t
-ext2_group_first_block_no(struct super_block *sb, unsigned long group_no)
- return group_no * (ext2_fsblk_t)EXT2_BLOCKS_PER_GROUP(sb) +
- le32_to_cpu(EXT2_SB(sb)->s_es->s_first_data_block);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 80d2f52..3763757 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -132,21 +132,6 @@ static void fuse_lookup_init(struct fuse_req *req, struct inode *dir,
req->out.args[0].value = outarg;
-static u64 fuse_get_attr_version(struct fuse_conn *fc)
- u64 curr_version;
- * The spin lock isn't actually needed on 64bit archs, but we
- * don't yet care too much about such optimizations.
- spin_lock(&fc->lock);
- curr_version = fc->attr_version;
- spin_unlock(&fc->lock);
- return curr_version;
* Check whether the dentry is still valid
@@ -186,7 +171,9 @@ static int fuse_dentry_revalidate(struct dentry *entry, struct nameidata *nd)
- attr_version = fuse_get_attr_version(fc);
+ spin_lock(&fc->lock);
+ attr_version = fc->attr_version;
+ spin_unlock(&fc->lock);
parent = dget_parent(entry);
fuse_lookup_init(req, parent->d_inode, entry, &outarg);
@@ -277,7 +264,9 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
return ERR_PTR(PTR_ERR(forget_req));
- attr_version = fuse_get_attr_version(fc);
+ spin_lock(&fc->lock);
+ attr_version = fc->attr_version;
+ spin_unlock(&fc->lock);
fuse_lookup_init(req, dir, entry, &outarg);
request_send(fc, req);
@@ -657,9 +646,6 @@ static int fuse_rename(struct inode *olddir, struct dentry *oldent,
err = req->out.h.error;
fuse_put_request(fc, req);
- /* ctime changes */
- fuse_invalidate_attr(oldent->d_inode);
fuse_invalidate_attr(olddir);
if (olddir != newdir)
fuse_invalidate_attr(newdir);
@@ -747,7 +733,9 @@ static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
return PTR_ERR(req);
- attr_version = fuse_get_attr_version(fc);
+ spin_lock(&fc->lock);
+ attr_version = fc->attr_version;
+ spin_unlock(&fc->lock);
memset(&inarg, 0, sizeof(inarg));
memset(&outarg, 0, sizeof(outarg));
@@ -787,31 +775,6 @@ static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
-int fuse_update_attributes(struct inode *inode, struct kstat *stat,
- struct file *file, bool *refreshed)
- struct fuse_inode *fi = get_fuse_inode(inode);
- if (fi->i_time < get_jiffies_64()) {
- err = fuse_do_getattr(inode, stat, file);
- generic_fillattr(inode, stat);
- stat->mode = fi->orig_i_mode;
- if (refreshed != NULL)
* Calling into a user-controlled filesystem gives the filesystem
* daemon ptrace-like capabilities over the requester process. This
@@ -899,9 +862,14 @@ static int fuse_permission(struct inode *inode, int mask, struct nameidata *nd)
if ((fc->flags & FUSE_DEFAULT_PERMISSIONS) ||
((mask & MAY_EXEC) && S_ISREG(inode->i_mode))) {
- err = fuse_update_attributes(inode, NULL, NULL, &refreshed);
+ struct fuse_inode *fi = get_fuse_inode(inode);
+ if (fi->i_time < get_jiffies_64()) {
+ err = fuse_do_getattr(inode, NULL, NULL);
+ refreshed = true;
if (fc->flags & FUSE_DEFAULT_PERMISSIONS) {
@@ -967,6 +935,7 @@ static int fuse_readdir(struct file *file, void *dstbuf, filldir_t filldir)
struct inode *inode = file->f_path.dentry->d_inode;
struct fuse_conn *fc = get_fuse_conn(inode);
+ struct fuse_file *ff = file->private_data;
struct fuse_req *req;
if (is_bad_inode(inode))
@@ -983,7 +952,7 @@ static int fuse_readdir(struct file *file, void *dstbuf, filldir_t filldir)
req->num_pages = 1;
req->pages[0] = page;
- fuse_read_fill(req, file, inode, file->f_pos, PAGE_SIZE, FUSE_READDIR);
+ fuse_read_fill(req, ff, inode, file->f_pos, PAGE_SIZE, FUSE_READDIR);
request_send(fc, req);
nbytes = req->out.args[0].size;
err = req->out.h.error;
@@ -1204,12 +1173,22 @@ static int fuse_getattr(struct vfsmount *mnt, struct dentry *entry,
struct kstat *stat)
struct inode *inode = entry->d_inode;
+ struct fuse_inode *fi = get_fuse_inode(inode);
struct fuse_conn *fc = get_fuse_conn(inode);
if (!fuse_allow_task(fc, current))
- return fuse_update_attributes(inode, stat, NULL, NULL);
+ if (fi->i_time < get_jiffies_64())
+ err = fuse_do_getattr(inode, stat, NULL);
+ generic_fillattr(inode, stat);
+ stat->mode = fi->orig_i_mode;
static int fuse_setxattr(struct dentry *entry, const char *name,
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index bb05d22..535b373 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -289,16 +289,14 @@ static int fuse_fsync(struct file *file, struct dentry *de, int datasync)
return fuse_fsync_common(file, de, datasync, 0);
-void fuse_read_fill(struct fuse_req *req, struct file *file,
+void fuse_read_fill(struct fuse_req *req, struct fuse_file *ff,
struct inode *inode, loff_t pos, size_t count, int opcode)
struct fuse_read_in *inarg = &req->misc.read_in;
- struct fuse_file *ff = file->private_data;
inarg->fh = ff->fh;
inarg->offset = pos;
inarg->size = count;
- inarg->flags = file->f_flags;
req->in.h.opcode = opcode;
req->in.h.nodeid = get_node_id(inode);
req->in.numargs = 1;
@@ -315,8 +313,9 @@ static size_t fuse_send_read(struct fuse_req *req, struct file *file,
struct fuse_conn *fc = get_fuse_conn(inode);
+ struct fuse_file *ff = file->private_data;
- fuse_read_fill(req, file, inode, pos, count, FUSE_READ);
+ fuse_read_fill(req, ff, inode, pos, count, FUSE_READ);
if (owner != NULL) {
struct fuse_read_in *inarg = &req->misc.read_in;
@@ -377,16 +376,15 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
fuse_put_request(fc, req);
-static void fuse_send_readpages(struct fuse_req *req, struct file *file,
+static void fuse_send_readpages(struct fuse_req *req, struct fuse_file *ff,
struct inode *inode)
struct fuse_conn *fc = get_fuse_conn(inode);
loff_t pos = page_offset(req->pages[0]);
size_t count = req->num_pages << PAGE_CACHE_SHIFT;
req->out.page_zeroing = 1;
- fuse_read_fill(req, file, inode, pos, count, FUSE_READ);
+ fuse_read_fill(req, ff, inode, pos, count, FUSE_READ);
if (fc->async_read) {
- struct fuse_file *ff = file->private_data;
req->ff = fuse_file_get(ff);
req->end = fuse_readpages_end;
request_send_background(fc, req);
@@ -398,7 +396,7 @@ static void fuse_send_readpages(struct fuse_req *req, struct file *file,
struct fuse_fill_data {
struct fuse_req *req;
- struct file *file;
+ struct fuse_file *ff;
struct inode *inode;
@@ -413,7 +411,7 @@ static int fuse_readpages_fill(void *_data, struct page *page)
(req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
(req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_read ||
req->pages[req->num_pages - 1]->index + 1 != page->index)) {
- fuse_send_readpages(req, data->file, inode);
+ fuse_send_readpages(req, data->ff, inode);
data->req = req = fuse_get_req(fc);
@@ -437,7 +435,7 @@ static int fuse_readpages(struct file *file, struct address_space *mapping,
if (is_bad_inode(inode))
- data.file = file;
+ data.ff = file->private_data;
data.inode = inode;
data.req = fuse_get_req(fc);
err = PTR_ERR(data.req);
@@ -447,7 +445,7 @@ static int fuse_readpages(struct file *file, struct address_space *mapping,
err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data);
if (data.req->num_pages)
- fuse_send_readpages(data.req, file, inode);
+ fuse_send_readpages(data.req, data.ff, inode);
fuse_put_request(fc, data.req);
@@ -455,31 +453,11 @@ out:
-static ssize_t fuse_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
- unsigned long nr_segs, loff_t pos)
- struct inode *inode = iocb->ki_filp->f_mapping->host;
- if (pos + iov_length(iov, nr_segs) > i_size_read(inode)) {
- * If trying to read past EOF, make sure the i_size
- * attribute is up-to-date.
- err = fuse_update_attributes(inode, NULL, iocb->ki_filp, NULL);
- return generic_file_aio_read(iocb, iov, nr_segs, pos);
-static void fuse_write_fill(struct fuse_req *req, struct file *file,
+static void fuse_write_fill(struct fuse_req *req, struct fuse_file *ff,
struct inode *inode, loff_t pos, size_t count,
struct fuse_conn *fc = get_fuse_conn(inode);
- struct fuse_file *ff = file->private_data;
struct fuse_write_in *inarg = &req->misc.write.in;
struct fuse_write_out *outarg = &req->misc.write.out;
@@ -488,7 +466,6 @@ static void fuse_write_fill(struct fuse_req *req, struct file *file,
inarg->offset = pos;
inarg->size = count;
inarg->write_flags = writepage ? FUSE_WRITE_CACHE : 0;
- inarg->flags = file->f_flags;
req->in.h.opcode = FUSE_WRITE;
req->in.h.nodeid = get_node_id(inode);
req->in.argpages = 1;
@@ -509,7 +486,7 @@ static size_t fuse_send_write(struct fuse_req *req, struct file *file,
struct fuse_conn *fc = get_fuse_conn(inode);
- fuse_write_fill(req, file, inode, pos, count, 0);
+ fuse_write_fill(req, file->private_data, inode, pos, count, 0);
if (owner != NULL) {
struct fuse_write_in *inarg = &req->misc.write.in;
inarg->write_flags |= FUSE_WRITE_LOCKOWNER;
@@ -910,7 +887,7 @@ static sector_t fuse_bmap(struct address_space *mapping, sector_t block)
static const struct file_operations fuse_file_operations = {
.llseek = generic_file_llseek,
.read = do_sync_read,
- .aio_read = fuse_file_aio_read,
+ .aio_read = generic_file_aio_read,
.write = do_sync_write,
.aio_write = generic_file_aio_write,
.mmap = fuse_file_mmap,
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 3ab8a30..6c5461d 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -447,7 +447,7 @@ void fuse_send_forget(struct fuse_conn *fc, struct fuse_req *req,
* Initialize READ or READDIR request
-void fuse_read_fill(struct fuse_req *req, struct file *file,
+void fuse_read_fill(struct fuse_req *req, struct fuse_file *ff,
struct inode *inode, loff_t pos, size_t count, int opcode);
@@ -593,6 +593,3 @@ int fuse_valid_type(int m);
int fuse_allow_task(struct fuse_conn *fc, struct task_struct *task);
u64 fuse_lock_owner_id(struct fuse_conn *fc, fl_owner_t id);
-int fuse_update_attributes(struct inode *inode, struct kstat *stat,
- struct file *file, bool *refreshed);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 84f9f7d..9a68d69 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -56,7 +56,6 @@ static struct inode *fuse_alloc_inode(struct super_block *sb)
- fi->attr_version = 0;
INIT_LIST_HEAD(&fi->write_files);
fi->forget_req = fuse_request_alloc();
if (!fi->forget_req) {
@@ -563,7 +562,8 @@ static void fuse_send_init(struct fuse_conn *fc, struct fuse_req *req)
arg->major = FUSE_KERNEL_VERSION;
arg->minor = FUSE_KERNEL_MINOR_VERSION;
arg->max_readahead = fc->bdi.ra_pages * PAGE_CACHE_SIZE;
- arg->flags |= FUSE_ASYNC_READ | FUSE_POSIX_LOCKS | FUSE_ATOMIC_O_TRUNC;
+ arg->flags |= FUSE_ASYNC_READ | FUSE_POSIX_LOCKS | FUSE_FILE_OPS |
+ FUSE_ATOMIC_O_TRUNC;
req->in.h.opcode = FUSE_INIT;
req->in.numargs = 1;
req->in.args[0].size = sizeof(*arg);
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 56f7790..556e34c 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -1514,7 +1514,7 @@ int ocfs2_size_fits_inline_data(struct buffer_head *di_bh, u64 new_size)
struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
- if (new_size <= le16_to_cpu(di->id2.i_data.id_count))
+ if (new_size < le16_to_cpu(di->id2.i_data.id_count))
diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
index 597e064..cd04606 100644
--- a/fs/ocfs2/cluster/masklog.h
+++ b/fs/ocfs2/cluster/masklog.h
@@ -212,7 +212,7 @@ extern struct mlog_bits mlog_and_bits, mlog_not_bits;
#define mlog_errno(st) do { \
if (_st != -ERESTARTSYS && _st != -EINTR && \
- _st != AOP_TRUNCATED_PAGE && _st != -ENOSPC) \
+ _st != AOP_TRUNCATED_PAGE) \
mlog(ML_ERROR, "status = %lld\n", (long long)_st); \
diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
index 9923278..1957a5e 100644
--- a/fs/ocfs2/dcache.c
+++ b/fs/ocfs2/dcache.c
@@ -344,24 +344,12 @@ static void ocfs2_dentry_iput(struct dentry *dentry, struct inode *inode)
struct ocfs2_dentry_lock *dl = dentry->d_fsdata;
- * No dentry lock is ok if we're disconnected or
- if (!(dentry->d_flags & DCACHE_DISCONNECTED) &&
- !d_unhashed(dentry)) {
- unsigned long long ino = 0ULL;
- ino = (unsigned long long)OCFS2_I(inode)->ip_blkno;
- mlog(ML_ERROR, "Dentry is missing cluster lock. "
- "inode: %llu, d_flags: 0x%x, d_name: %.*s\n",
- ino, dentry->d_flags, dentry->d_name.len,
- dentry->d_name.name);
+ mlog_bug_on_msg(!dl && !(dentry->d_flags & DCACHE_DISCONNECTED),
+ "dentry: %.*s\n", dentry->d_name.len,
+ dentry->d_name.name);
mlog_bug_on_msg(dl->dl_count == 0, "dentry: %.*s, count: %u\n",
dentry->d_name.len, dentry->d_name.name,
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index a54d33d..62e4a7d 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -908,7 +908,7 @@ lookup:
* but they might own this lockres. wait on them. */
bit = find_next_bit(dlm->recovery_map, O2NM_MAX_NODES, 0);
if (bit < O2NM_MAX_NODES) {
- mlog(ML_NOTICE, "%s:%.*s: at least one node (%d) to "
+ mlog(ML_NOTICE, "%s:%.*s: at least one node (%d) to"
"recover before lock mastery can begin\n",
dlm->name, namelen, (char *)lockid, bit);
wait_on_recovery = 1;
@@ -962,7 +962,7 @@ redo_request:
spin_lock(&dlm->spinlock);
bit = find_next_bit(dlm->recovery_map, O2NM_MAX_NODES, 0);
if (bit < O2NM_MAX_NODES) {
- mlog(ML_NOTICE, "%s:%.*s: at least one node (%d) to "
+ mlog(ML_NOTICE, "%s:%.*s: at least one node (%d) to"
"recover before lock mastery can begin\n",
dlm->name, namelen, (char *)lockid, bit);
wait_on_recovery = 1;
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index b75b2e1..bbac7cd 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -399,7 +399,7 @@ static int ocfs2_truncate_file(struct inode *inode,
if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) {
status = ocfs2_truncate_inline(inode, di_bh, new_i_size,
- i_size_read(inode), 1);
+ i_size_read(inode), 0);
mlog_errno(status);
@@ -1521,7 +1521,6 @@ static int ocfs2_remove_inode_range(struct inode *inode,
u32 trunc_start, trunc_len, cpos, phys_cpos, alloc_size;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
struct ocfs2_cached_dealloc_ctxt dealloc;
- struct address_space *mapping = inode->i_mapping;
ocfs2_init_dealloc_ctxt(&dealloc);
@@ -1530,20 +1529,10 @@ static int ocfs2_remove_inode_range(struct inode *inode,
if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) {
ret = ocfs2_truncate_inline(inode, di_bh, byte_start,
- byte_start + byte_len, 0);
+ byte_start + byte_len, 1);
- * There's no need to get fancy with the page cache
- * truncate of an inline-data inode. We're talking
- * about less than a page here, which will be cached
- * in the dinode buffer anyway.
- unmap_mapping_range(mapping, 0, 0, 0);
- truncate_inode_pages(mapping, 0);
trunc_start = ocfs2_clusters_for_bytes(osb->sb, byte_start);
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index ebb2bbe..1d5e0cb 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -455,8 +455,8 @@ static int ocfs2_read_locked_inode(struct inode *inode,
fe = (struct ocfs2_dinode *) bh->b_data;
if (!OCFS2_IS_VALID_DINODE(fe)) {
- mlog(0, "Invalid dinode #%llu: signature = %.*s\n",
- (unsigned long long)args->fi_blkno, 7,
+ mlog(ML_ERROR, "Invalid dinode #%llu: signature = %.*s\n",
+ (unsigned long long)le64_to_cpu(fe->i_blkno), 7,
@@ -863,7 +863,7 @@ static int ocfs2_query_inode_wipe(struct inode *inode,
status = ocfs2_try_open_lock(inode, 1);
if (status == -EAGAIN) {
- mlog(0, "Skipping delete of %llu because it is in use on "
+ mlog(0, "Skipping delete of %llu because it is in use on"
"other nodes\n", (unsigned long long)oi->ip_blkno);
diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
index 58ea88b..d272847 100644
--- a/fs/ocfs2/localalloc.c
+++ b/fs/ocfs2/localalloc.c
@@ -484,7 +484,6 @@ int ocfs2_reserve_local_alloc_bits(struct ocfs2_super *osb,
alloc = (struct ocfs2_dinode *) osb->local_alloc_bh->b_data;
-#ifdef OCFS2_DEBUG_FS
if (le32_to_cpu(alloc->id1.bitmap1.i_used) !=
ocfs2_local_alloc_count_bits(alloc)) {
ocfs2_error(osb->sb, "local alloc inode %llu says it has "
@@ -495,7 +494,6 @@ int ocfs2_reserve_local_alloc_bits(struct ocfs2_super *osb,
free_bits = le32_to_cpu(alloc->id1.bitmap1.i_total) -
le32_to_cpu(alloc->id1.bitmap1.i_used);
@@ -714,8 +712,9 @@ static int ocfs2_sync_local_to_main(struct ocfs2_super *osb,
struct ocfs2_local_alloc *la = OCFS2_LOCAL_ALLOC(alloc);
- mlog_entry("total = %u, used = %u\n",
+ mlog_entry("total = %u, COUNT = %u, used = %u\n",
le32_to_cpu(alloc->id1.bitmap1.i_total),
+ ocfs2_local_alloc_count_bits(alloc),
le32_to_cpu(alloc->id1.bitmap1.i_used));
if (!alloc->id1.bitmap1.i_total) {
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 5ee7754..be562ac 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -438,14 +438,14 @@ unlock_osb:
+ if (!ocfs2_is_hard_readonly(osb))
+ ocfs2_set_journal_params(osb);
/* Only save off the new mount options in case of a successful
osb->s_mount_opt = parsed_options.mount_opt;
osb->s_atime_quantum = parsed_options.atime_quantum;
osb->preferred_slot = parsed_options.slot;
- if (!ocfs2_is_hard_readonly(osb))
- ocfs2_set_journal_params(osb);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 02a63ac..a17c268 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2411,23 +2411,19 @@ out:
* Find the first task with tgid >= tgid
-struct tgid_iter {
- unsigned int tgid;
- struct task_struct *task;
-static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter iter)
+static struct task_struct *next_tgid(unsigned int tgid,
+ struct pid_namespace *ns)
+ struct task_struct *task;
- put_task_struct(iter.task);
- iter.task = NULL;
- pid = find_ge_pid(iter.tgid, ns);
+ pid = find_ge_pid(tgid, ns);
- iter.tgid = pid_nr_ns(pid, ns);
- iter.task = pid_task(pid, PIDTYPE_PID);
+ tgid = pid_nr_ns(pid, ns) + 1;
+ task = pid_task(pid, PIDTYPE_PID);
/* What we to know is if the pid we have find is the
* pid of a thread_group_leader. Testing for task
* being a thread_group_leader is the obvious thing
@@ -2440,25 +2436,23 @@ retry:
* found doesn't happen to be a thread group leader.
* As we don't care in the case of readdir.
- if (!iter.task || !has_group_leader_pid(iter.task)) {
+ if (!task || !has_group_leader_pid(task))
- get_task_struct(iter.task);
+ get_task_struct(task);
#define TGID_OFFSET (FIRST_PROCESS_ENTRY + ARRAY_SIZE(proc_base_stuff))
static int proc_pid_fill_cache(struct file *filp, void *dirent, filldir_t filldir,
- struct tgid_iter iter)
+ struct task_struct *task, int tgid)
char name[PROC_NUMBUF];
- int len = snprintf(name, sizeof(name), "%d", iter.tgid);
+ int len = snprintf(name, sizeof(name), "%d", tgid);
return proc_fill_cache(filp, dirent, filldir, name, len,
- proc_pid_instantiate, iter.task, NULL);
+ proc_pid_instantiate, task, NULL);
/* for the /proc/ directory itself, after non-process stuff has been done */
@@ -2466,7 +2460,8 @@ int proc_pid_readdir(struct file * filp, void * dirent, filldir_t filldir)
unsigned int nr = filp->f_pos - FIRST_PROCESS_ENTRY;
struct task_struct *reaper = get_proc_task(filp->f_path.dentry->d_inode);
- struct tgid_iter iter;
+ struct task_struct *task;
struct pid_namespace *ns;
@@ -2479,14 +2474,14 @@ int proc_pid_readdir(struct file * filp, void * dirent, filldir_t filldir)
ns = filp->f_dentry->d_sb->s_fs_info;
- iter.task = NULL;
- iter.tgid = filp->f_pos - TGID_OFFSET;
- for (iter = next_tgid(ns, iter);
- iter.tgid += 1, iter = next_tgid(ns, iter)) {
- filp->f_pos = iter.tgid + TGID_OFFSET;
- if (proc_pid_fill_cache(filp, dirent, filldir, iter) < 0) {
- put_task_struct(iter.task);
+ tgid = filp->f_pos - TGID_OFFSET;
+ for (task = next_tgid(tgid, ns);
+ put_task_struct(task), task = next_tgid(tgid + 1, ns)) {
+ tgid = task_pid_nr_ns(task, ns);
+ filp->f_pos = tgid + TGID_OFFSET;
+ if (proc_pid_fill_cache(filp, dirent, filldir, task, tgid) < 0) {
+ put_task_struct(task);
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 39f3d65..a9806bc 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -555,6 +555,41 @@ static int proc_register(struct proc_dir_entry * dir, struct proc_dir_entry * dp
+ * Kill an inode that got unregistered..
+static void proc_kill_inodes(struct proc_dir_entry *de)
+ struct list_head *p;
+ struct super_block *sb;
+ * Actually it's a partial revoke().
+ spin_lock(&sb_lock);
+ list_for_each_entry(sb, &proc_fs_type.fs_supers, s_instances) {
+ file_list_lock();
+ list_for_each(p, &sb->s_files) {
+ struct file *filp = list_entry(p, struct file,
+ struct dentry *dentry = filp->f_path.dentry;
+ struct inode *inode;
+ const struct file_operations *fops;
+ if (dentry->d_op != &proc_dentry_operations)
+ inode = dentry->d_inode;
+ if (PDE(inode) != de)
+ fops = filp->f_op;
+ filp->f_op = NULL;
+ file_list_unlock();
+ spin_unlock(&sb_lock);
static struct proc_dir_entry *proc_create(struct proc_dir_entry **parent,
@@ -729,6 +764,8 @@ void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
if (S_ISDIR(de->mode))
+ if (!S_ISREG(de->mode))
+ proc_kill_inodes(de);
WARN_ON(de->subdir);
if (!atomic_read(&de->count))
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 1820eb2..1b2b6c6 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -78,3 +78,5 @@ static inline int proc_fd(struct inode *inode)
return PROC_I(inode)->fd;
+extern struct file_system_type proc_fs_type;
diff --git a/fs/proc/root.c b/fs/proc/root.c
index ec9cb3b..1f86bb8 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -98,7 +98,7 @@ static void proc_kill_sb(struct super_block *sb)
-static struct file_system_type proc_fs_type = {
+struct file_system_type proc_fs_type = {
.get_sb = proc_get_sb,
.kill_sb = proc_kill_sb,
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 4045bdc..27d1785 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -119,11 +119,7 @@ static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer
sysfs_put_active_two(attr_sd);
- * The code works fine with PAGE_SIZE return but it's likely to
- * indicate truncated result or overflow in normal use cases.
- BUG_ON(count >= (ssize_t)PAGE_SIZE);
+ BUG_ON(count > (ssize_t)PAGE_SIZE);
buffer->needs_read_fill = 0;
buffer->count = count;
diff --git a/include/asm-arm/arch-at91/board.h b/include/asm-arm/arch-at91/board.h
index 7905496..c0d7075 100644
--- a/include/asm-arm/arch-at91/board.h
+++ b/include/asm-arm/arch-at91/board.h
#include <linux/mtd/partitions.h>
#include <linux/device.h>
-#include <linux/i2c.h>
#include <linux/spi/spi.h>
@@ -95,7 +94,7 @@ struct at91_nand_data {
extern void __init at91_add_device_nand(struct at91_nand_data *data);
-extern void __init at91_add_device_i2c(struct i2c_board_info *devices, int nr_devices);
+extern void __init at91_add_device_i2c(void);
extern void __init at91_add_device_spi(struct spi_board_info *devices, int nr_devices);
diff --git a/include/asm-arm/arch-ixp23xx/irqs.h b/include/asm-arm/arch-ixp23xx/irqs.h
index 27c5808..e696395 100644
--- a/include/asm-arm/arch-ixp23xx/irqs.h
+++ b/include/asm-arm/arch-ixp23xx/irqs.h
@@ -153,7 +153,7 @@
#define NR_IXP23XX_MACH_IRQS 32
-#define NR_IRQS (NR_IXP23XX_IRQS + NR_IXP23XX_MACH_IRQS)
+#define NR_IRQS NR_IXP23XX_IRQS + NR_IXP23XX_MACH_IRQS
#define IXP23XX_MACH_IRQ(irq) (NR_IXP23XX_IRQ + (irq))
diff --git a/include/asm-arm/arch-omap/board-innovator.h b/include/asm-arm/arch-omap/board-innovator.h
index 56d2c98..b3cf334 100644
--- a/include/asm-arm/arch-omap/board-innovator.h
+++ b/include/asm-arm/arch-omap/board-innovator.h
#define OMAP1510P1_EMIFF_PRI_VALUE 0x00
#define NR_FPGA_IRQS 24
-#define NR_IRQS (IH_BOARD_BASE + NR_FPGA_IRQS)
+#define NR_IRQS IH_BOARD_BASE + NR_FPGA_IRQS
#ifndef __ASSEMBLY__
void fpga_write(unsigned char val, int reg);
diff --git a/include/asm-arm/arch-pxa/irqs.h b/include/asm-arm/arch-pxa/irqs.h
index b76ee6d..6238dbf 100644
--- a/include/asm-arm/arch-pxa/irqs.h
+++ b/include/asm-arm/arch-pxa/irqs.h
#define PXA_IRQ(x) (x)
-#if defined(CONFIG_PXA27x) || defined(CONFIG_PXA3xx)
+#ifdef CONFIG_PXA27x
#define IRQ_SSP3 PXA_IRQ(0) /* SSP3 service request */
#define IRQ_MSL PXA_IRQ(1) /* MSL Interface interrupt */
#define IRQ_USBH2 PXA_IRQ(2) /* USB Host interrupt 1 (OHCI) */
@@ -52,27 +52,11 @@
#define IRQ_RTC1Hz PXA_IRQ(30) /* RTC HZ Clock Tick */
#define IRQ_RTCAlrm PXA_IRQ(31) /* RTC Alarm */
-#if defined(CONFIG_PXA27x) || defined(CONFIG_PXA3xx)
+#ifdef CONFIG_PXA27x
#define IRQ_TPM PXA_IRQ(32) /* TPM interrupt */
#define IRQ_CAMERA PXA_IRQ(33) /* Camera Interface */
-#ifdef CONFIG_PXA3xx
-#define IRQ_SSP4 PXA_IRQ(13) /* SSP4 service request */
-#define IRQ_CIR PXA_IRQ(34) /* Consumer IR */
-#define IRQ_TSI PXA_IRQ(36) /* Touch Screen Interface (PXA320) */
-#define IRQ_USIM2 PXA_IRQ(38) /* USIM2 Controller */
-#define IRQ_GRPHICS PXA_IRQ(39) /* Graphics Controller */
-#define IRQ_MMC2 PXA_IRQ(41) /* MMC2 Controller */
-#define IRQ_1WIRE PXA_IRQ(44) /* 1-Wire Controller */
-#define IRQ_NAND PXA_IRQ(45) /* NAND Controller */
-#define IRQ_USB2 PXA_IRQ(46) /* USB 2.0 Device Controller */
-#define IRQ_WAKEUP0 PXA_IRQ(49) /* EXT_WAKEUP0 */
-#define IRQ_WAKEUP1 PXA_IRQ(50) /* EXT_WAKEUP1 */
-#define IRQ_DMEMC PXA_IRQ(51) /* Dynamic Memory Controller */
-#define IRQ_MMC3 PXA_IRQ(55) /* MMC3 Controller (PXA310) */
#define PXA_GPIO_IRQ_BASE (64)
#define PXA_GPIO_IRQ_NUM (128)
diff --git a/include/asm-arm/arch-pxa/mfp-pxa300.h b/include/asm-arm/arch-pxa/mfp-pxa300.h
index a209966..822a27c 100644
--- a/include/asm-arm/arch-pxa/mfp-pxa300.h
+++ b/include/asm-arm/arch-pxa/mfp-pxa300.h
@@ -179,7 +179,7 @@
#define GPIO62_LCD_CS_N MFP_CFG_DRV(GPIO62, AF2, DS01X)
#define GPIO72_LCD_FCLK MFP_CFG_DRV(GPIO72, AF1, DS01X)
#define GPIO73_LCD_LCLK MFP_CFG_DRV(GPIO73, AF1, DS01X)
-#define GPIO74_LCD_PCLK MFP_CFG_DRV(GPIO74, AF1, DS02X)
+#define GPIO74_LCD_PCLK MFP_CFG_DRV(GPIO74, AF1, DS01X)
#define GPIO75_LCD_BIAS MFP_CFG_DRV(GPIO75, AF1, DS01X)
#define GPIO76_LCD_VSYNC MFP_CFG_DRV(GPIO76, AF2, DS01X)
diff --git a/include/asm-arm/arch-pxa/mfp-pxa320.h b/include/asm-arm/arch-pxa/mfp-pxa320.h
index 52deedc..488a5bb 100644
--- a/include/asm-arm/arch-pxa/mfp-pxa320.h
+++ b/include/asm-arm/arch-pxa/mfp-pxa320.h
#include <asm/arch/mfp.h>
-#define GPIO46_GPIO MFP_CFG(GPIO46, AF0)
+#define GPIO46_GPIO MFP_CFG(GPIO6, AF0)
#define GPIO49_GPIO MFP_CFG(GPIO49, AF0)
#define GPIO50_GPIO MFP_CFG(GPIO50, AF0)
#define GPIO51_GPIO MFP_CFG(GPIO51, AF0)
diff --git a/include/asm-arm/arch-pxa/mfp.h b/include/asm-arm/arch-pxa/mfp.h
index 03c508d..ac4157a 100644
--- a/include/asm-arm/arch-pxa/mfp.h
+++ b/include/asm-arm/arch-pxa/mfp.h
@@ -346,31 +346,23 @@ typedef uint32_t mfp_cfg_t;
#define MFP_CFG_PIN(mfp_cfg) (((mfp_cfg) >> 16) & 0xffff)
#define MFP_CFG_VAL(mfp_cfg) ((mfp_cfg) & 0xffff)
- * MFP register defaults to
- * drive strength fast 3mA (010'b)
- * edge detection logic disabled
- * alternate function 0
-#define MFPR_DEFAULT (0x0840)
+#define MFPR_DEFAULT (0x0000)
#define MFP_CFG(pin, af) \
((MFP_PIN_##pin << 16) | MFPR_DEFAULT | (MFP_##af))
#define MFP_CFG_DRV(pin, af, drv) \
- ((MFP_PIN_##pin << 16) | (MFPR_DEFAULT & ~MFPR_DRV_MASK) |\
+ ((MFP_PIN_##pin << 16) | MFPR_DEFAULT |\
((MFP_##drv) << 10) | (MFP_##af))
#define MFP_CFG_LPM(pin, af, lpm) \
- ((MFP_PIN_##pin << 16) | (MFPR_DEFAULT & ~MFPR_LPM_MASK) |\
+ ((MFP_PIN_##pin << 16) | MFPR_DEFAULT | (MFP_##af) |\
(((MFP_LPM_##lpm) & 0x3) << 7) |\
(((MFP_LPM_##lpm) & 0x4) << 12) |\
- (((MFP_LPM_##lpm) & 0x8) << 10) |\
+ (((MFP_LPM_##lpm) & 0x8) << 10))
#define MFP_CFG_X(pin, af, drv, lpm) \
- ((MFP_PIN_##pin << 16) |\
- (MFPR_DEFAULT & ~(MFPR_DRV_MASK | MFPR_LPM_MASK)) |\
+ ((MFP_PIN_##pin << 16) | MFPR_DEFAULT |\
((MFP_##drv) << 10) | (MFP_##af) |\
(((MFP_LPM_##lpm) & 0x3) << 7) |\
(((MFP_LPM_##lpm) & 0x4) << 12) |\
diff --git a/include/asm-arm/arch-pxa/pxa-regs.h b/include/asm-arm/arch-pxa/pxa-regs.h
22815
index 6b33df6..bb68b59 100644
22816
--- a/include/asm-arm/arch-pxa/pxa-regs.h
22817
+++ b/include/asm-arm/arch-pxa/pxa-regs.h
22818
@@ -110,10 +110,7 @@
22819
#define DALGN __REG(0x400000a0) /* DMA Alignment Register */
22820
#define DINT __REG(0x400000f0) /* DMA Interrupt Register */
22822
-#define DRCMR(n) (*(((n) < 64) ? \
22823
- &__REG2(0x40000100, ((n) & 0x3f) << 2) : \
- &__REG2(0x40001100, ((n) & 0x3f) << 2)))
+#define DRCMR(n) __REG2(0x40000100, (n)<<2)
#define DRCMR0 __REG(0x40000100) /* Request to Channel Map Register for DREQ 0 */
#define DRCMR1 __REG(0x40000104) /* Request to Channel Map Register for DREQ 1 */
#define DRCMR2 __REG(0x40000108) /* Request to Channel Map Register for I2S receive Request */
diff --git a/include/asm-arm/arch-s3c2410/spi-gpio.h b/include/asm-arm/arch-s3c2410/spi-gpio.h
index ba1dca8..c1e4db7 100644
--- a/include/asm-arm/arch-s3c2410/spi-gpio.h
+++ b/include/asm-arm/arch-s3c2410/spi-gpio.h
@@ -21,8 +21,6 @@ struct s3c2410_spigpio_info {
unsigned long pin_mosi;
unsigned long pin_miso;
unsigned long board_size;
struct spi_board_info *board_info;
diff --git a/include/asm-m32r/thread_info.h b/include/asm-m32r/thread_info.h
index 1effcd0..c039820 100644
--- a/include/asm-m32r/thread_info.h
+++ b/include/asm-m32r/thread_info.h
@@ -149,21 +149,16 @@ static inline unsigned int get_thread_fault_code(void)
#define TIF_NEED_RESCHED 2 /* rescheduling necessary */
#define TIF_SINGLESTEP 3 /* restore singlestep on return to user mode */
#define TIF_IRET 4 /* return with iret */
-#define TIF_RESTORE_SIGMASK 8 /* restore signal mask in do_signal() */
-#define TIF_USEDFPU 16 /* FPU was used by this task this quantum (SMP) */
-#define TIF_POLLING_NRFLAG 17 /* true if poll_idle() is polling TIF_NEED_RESCHED */
-#define TIF_MEMDIE 18 /* OOM killer killed process */
-#define TIF_FREEZE 19 /* is freezing for suspend */
+#define TIF_POLLING_NRFLAG 16 /* true if poll_idle() is polling TIF_NEED_RESCHED */
+ /* 31..28 fault code */
+#define TIF_MEMDIE 17
#define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE)
#define _TIF_SIGPENDING (1<<TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1<<TIF_NEED_RESCHED)
#define _TIF_SINGLESTEP (1<<TIF_SINGLESTEP)
#define _TIF_IRET (1<<TIF_IRET)
-#define _TIF_RESTORE_SIGMASK (1<<TIF_RESTORE_SIGMASK)
-#define _TIF_USEDFPU (1<<TIF_USEDFPU)
#define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
-#define _TIF_FREEZE (1<<TIF_FREEZE)
#define _TIF_WORK_MASK 0x0000FFFE /* work to do on interrupt/exception return */
#define _TIF_ALLWORK_MASK 0x0000FFFF /* work to do on any return to u-space */
diff --git a/include/asm-m32r/unistd.h b/include/asm-m32r/unistd.h
index f467eac..cbbd537 100644
--- a/include/asm-m32r/unistd.h
+++ b/include/asm-m32r/unistd.h
@@ -290,50 +290,10 @@
#define __NR_mq_getsetattr (__NR_mq_open+5)
#define __NR_kexec_load 283
#define __NR_waitid 284
-/* 285 is unused */
-#define __NR_add_key 286
-#define __NR_request_key 287
-#define __NR_keyctl 288
-#define __NR_ioprio_set 289
-#define __NR_ioprio_get 290
-#define __NR_inotify_init 291
-#define __NR_inotify_add_watch 292
-#define __NR_inotify_rm_watch 293
-#define __NR_migrate_pages 294
-#define __NR_openat 295
-#define __NR_mkdirat 296
-#define __NR_mknodat 297
-#define __NR_fchownat 298
-#define __NR_futimesat 299
-#define __NR_fstatat64 300
-#define __NR_unlinkat 301
-#define __NR_renameat 302
-#define __NR_linkat 303
-#define __NR_symlinkat 304
-#define __NR_readlinkat 305
-#define __NR_fchmodat 306
-#define __NR_faccessat 307
-#define __NR_pselect6 308
-#define __NR_ppoll 309
-#define __NR_unshare 310
-#define __NR_set_robust_list 311
-#define __NR_get_robust_list 312
-#define __NR_splice 313
-#define __NR_sync_file_range 314
-#define __NR_tee 315
-#define __NR_vmsplice 316
-#define __NR_move_pages 317
-#define __NR_getcpu 318
-#define __NR_epoll_pwait 319
-#define __NR_utimensat 320
-#define __NR_signalfd 321
-#define __NR_timerfd 322
-#define __NR_eventfd 323
-#define __NR_fallocate 324
-#define NR_syscalls 325
+#define NR_syscalls 285
#define __ARCH_WANT_IPC_PARSE_VERSION
#define __ARCH_WANT_STAT64
@@ -351,30 +311,6 @@
#define __ARCH_WANT_SYS_OLDUMOUNT
#define __ARCH_WANT_SYS_RT_SIGACTION
-#define __IGNORE_lchown
-#define __IGNORE_setuid
-#define __IGNORE_getuid
-#define __IGNORE_setgid
-#define __IGNORE_getgid
-#define __IGNORE_geteuid
-#define __IGNORE_getegid
-#define __IGNORE_fcntl
-#define __IGNORE_setreuid
-#define __IGNORE_setregid
-#define __IGNORE_getrlimit
-#define __IGNORE_getgroups
-#define __IGNORE_setgroups
-#define __IGNORE_select
-#define __IGNORE_mmap
-#define __IGNORE_fchown
-#define __IGNORE_setfsuid
-#define __IGNORE_setfsgid
-#define __IGNORE_setresuid
-#define __IGNORE_getresuid
-#define __IGNORE_setresgid
-#define __IGNORE_getresgid
-#define __IGNORE_chown
* "Conditional" syscalls
diff --git a/include/asm-mips/cpu-features.h b/include/asm-mips/cpu-features.h
index 5ea701f..f6bd308 100644
--- a/include/asm-mips/cpu-features.h
+++ b/include/asm-mips/cpu-features.h
@@ -207,13 +207,13 @@
#ifndef cpu_dcache_line_size
-#define cpu_dcache_line_size() cpu_data[0].dcache.linesz
+#define cpu_dcache_line_size() current_cpu_data.dcache.linesz
#ifndef cpu_icache_line_size
-#define cpu_icache_line_size() cpu_data[0].icache.linesz
+#define cpu_icache_line_size() current_cpu_data.icache.linesz
#ifndef cpu_scache_line_size
-#define cpu_scache_line_size() cpu_data[0].scache.linesz
+#define cpu_scache_line_size() current_cpu_data.scache.linesz
#endif /* __ASM_CPU_FEATURES_H */
diff --git a/include/asm-mips/system.h b/include/asm-mips/system.h
index a944eda..1030562 100644
--- a/include/asm-mips/system.h
+++ b/include/asm-mips/system.h
@@ -209,6 +209,8 @@ extern void *set_except_vector(int n, void *addr);
extern unsigned long ebase;
extern void per_cpu_trap_init(void);
+extern int stop_a_enabled;
* See include/asm-ia64/system.h; prevents deadlock on SMP
diff --git a/include/asm-x86/Kbuild b/include/asm-x86/Kbuild
index 12db5a1..da5eb69 100644
--- a/include/asm-x86/Kbuild
+++ b/include/asm-x86/Kbuild
@@ -3,6 +3,7 @@ include include/asm-generic/Kbuild.asm
header-y += bootparam.h
header-y += debugreg.h
header-y += msr-index.h
header-y += prctl.h
diff --git a/include/asm-x86/kvm.h b/include/asm-x86/kvm.h
new file mode 100644
index 0000000..17afa81
+++ b/include/asm-x86/kvm.h
+#ifndef __LINUX_KVM_X86_H
+#define __LINUX_KVM_X86_H
+ * KVM x86 specific structures and definitions
+#include <asm/types.h>
+#include <linux/ioctl.h>
+/* Architectural interrupt line count. */
+#define KVM_NR_INTERRUPTS 256
+struct kvm_memory_alias {
+ __u32 slot; /* this has a different namespace than memory slots */
+ __u64 guest_phys_addr;
+ __u64 memory_size;
+ __u64 target_phys_addr;
+/* for KVM_GET_IRQCHIP and KVM_SET_IRQCHIP */
+struct kvm_pic_state {
+ __u8 last_irr; /* edge detection */
+ __u8 irr; /* interrupt request register */
+ __u8 imr; /* interrupt mask register */
+ __u8 isr; /* interrupt service register */
+ __u8 priority_add; /* highest irq priority */
+ __u8 read_reg_select;
+ __u8 special_mask;
+ __u8 rotate_on_auto_eoi;
+ __u8 special_fully_nested_mode;
+ __u8 init4; /* true if 4 byte init */
+ __u8 elcr; /* PIIX edge/trigger selection */
+#define KVM_IOAPIC_NUM_PINS 24
+struct kvm_ioapic_state {
+ __u64 base_address;
+ __u8 delivery_mode:3;
+ __u8 dest_mode:1;
+ __u8 delivery_status:1;
+ __u8 remote_irr:1;
+ __u8 trig_mode:1;
+ __u8 reserved[4];
+ } redirtbl[KVM_IOAPIC_NUM_PINS];
+#define KVM_IRQCHIP_PIC_MASTER 0
+#define KVM_IRQCHIP_PIC_SLAVE 1
+#define KVM_IRQCHIP_IOAPIC 2
+/* for KVM_GET_REGS and KVM_SET_REGS */
+ /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
+ __u64 rax, rbx, rcx, rdx;
+ __u64 rsi, rdi, rsp, rbp;
+ __u64 r8, r9, r10, r11;
+ __u64 r12, r13, r14, r15;
+ __u64 rip, rflags;
+/* for KVM_GET_LAPIC and KVM_SET_LAPIC */
+#define KVM_APIC_REG_SIZE 0x400
+struct kvm_lapic_state {
+ char regs[KVM_APIC_REG_SIZE];
+struct kvm_segment {
+ __u8 present, dpl, db, s, l, g, avl;
+struct kvm_dtable {
+ __u16 padding[3];
+/* for KVM_GET_SREGS and KVM_SET_SREGS */
+struct kvm_sregs {
+ /* out (KVM_GET_SREGS) / in (KVM_SET_SREGS) */
+ struct kvm_segment cs, ds, es, fs, gs, ss;
+ struct kvm_segment tr, ldt;
+ struct kvm_dtable gdt, idt;
+ __u64 cr0, cr2, cr3, cr4, cr8;
+ __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
+struct kvm_msr_entry {
+/* for KVM_GET_MSRS and KVM_SET_MSRS */
+ __u32 nmsrs; /* number of msrs in entries */
+ struct kvm_msr_entry entries[0];
+/* for KVM_GET_MSR_INDEX_LIST */
+struct kvm_msr_list {
+ __u32 nmsrs; /* number of msrs in entries */
+ __u32 indices[0];
+struct kvm_cpuid_entry {
+/* for KVM_SET_CPUID */
+struct kvm_cpuid {
+ struct kvm_cpuid_entry entries[0];
+struct kvm_cpuid_entry2 {
+ __u32 padding[3];
+#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1
+#define KVM_CPUID_FLAG_STATEFUL_FUNC 2
+#define KVM_CPUID_FLAG_STATE_READ_NEXT 4
+/* for KVM_SET_CPUID2 */
+struct kvm_cpuid2 {
+ struct kvm_cpuid_entry2 entries[0];
diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
new file mode 100644
index 0000000..c6f3fd8
+++ b/include/asm-x86/kvm_para.h
+#ifndef __X86_KVM_PARA_H
+#define __X86_KVM_PARA_H
+/* This CPUID returns the signature 'KVMKVMKVM' in ebx, ecx, and edx. It
+ * should be used to determine that a VM is running under KVM.
+#define KVM_CPUID_SIGNATURE 0x40000000
+/* This CPUID returns a feature bitmap in eax. Before enabling a particular
+ * paravirtualization, the appropriate feature bit should be checked.
+#define KVM_CPUID_FEATURES 0x40000001
+#include <asm/processor.h>
+/* This instruction is vmcall. On non-VT architectures, it will generate a
+ * trap that we will then rewrite to the appropriate instruction.
+#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1"
+/* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
+ * instruction. The hypervisor may replace it with something else but only the
+ * instructions are guaranteed to be supported.
+ * Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively.
+ * The hypercall number should be placed in rax and the return value will be
+ * placed in rax. No other registers will be clobbered unless explicitly
+ * noted by the particular hypercall.
+static inline long kvm_hypercall0(unsigned int nr)
+ asm volatile(KVM_HYPERCALL
+static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+ asm volatile(KVM_HYPERCALL
+ : "a"(nr), "b"(p1));
+static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
+ unsigned long p2)
+ asm volatile(KVM_HYPERCALL
+ : "a"(nr), "b"(p1), "c"(p2));
+static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
+ unsigned long p2, unsigned long p3)
+ asm volatile(KVM_HYPERCALL
+ : "a"(nr), "b"(p1), "c"(p2), "d"(p3));
+static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
+ unsigned long p2, unsigned long p3,
+ unsigned long p4)
+ asm volatile(KVM_HYPERCALL
+ : "a"(nr), "b"(p1), "c"(p2), "d"(p3), "S"(p4));
+static inline int kvm_para_available(void)
+ unsigned int eax, ebx, ecx, edx;
+ char signature[13];
+ cpuid(KVM_CPUID_SIGNATURE, &eax, &ebx, &ecx, &edx);
+ memcpy(signature + 0, &ebx, 4);
+ memcpy(signature + 4, &ecx, 4);
+ memcpy(signature + 8, &edx, 4);
+ signature[12] = 0;
+ if (strcmp(signature, "KVMKVMKVM") == 0)
+static inline unsigned int kvm_arch_para_features(void)
+ return cpuid_eax(KVM_CPUID_FEATURES);
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 37bfa19..397197f 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -98,7 +98,6 @@ header-y += iso_fs.h
header-y += ixjuser.h
header-y += jffs2.h
header-y += keyctl.h
header-y += limits.h
header-y += lock_dlm_plock.h
header-y += magic.h
@@ -255,6 +254,7 @@ unifdef-y += kd.h
unifdef-y += kernelcapi.h
unifdef-y += kernel.h
unifdef-y += keyboard.h
+unifdef-$(CONFIG_ARCH_SUPPORTS_KVM) += kvm.h
unifdef-y += loop.h
diff --git a/include/linux/ext2_fs.h b/include/linux/ext2_fs.h
index 84cec2a..0f6c86c 100644
--- a/include/linux/ext2_fs.h
+++ b/include/linux/ext2_fs.h
@@ -563,4 +563,11 @@ enum {
#define EXT2_MAX_REC_LEN ((1<<16)-1)
+static inline ext2_fsblk_t
+ext2_group_first_block_no(struct super_block *sb, unsigned long group_no)
+ return group_no * (ext2_fsblk_t)EXT2_BLOCKS_PER_GROUP(sb) +
+ le32_to_cpu(EXT2_SB(sb)->s_es->s_first_data_block);
#endif /* _LINUX_EXT2_FS_H */
diff --git a/include/linux/fuse.h b/include/linux/fuse.h
index 5c86f11..d0c4370 100644
--- a/include/linux/fuse.h
+++ b/include/linux/fuse.h
* - add lk_flags in fuse_lk_in
* - add lock_owner field to fuse_setattr_in, fuse_read_in and fuse_write_in
* - add blksize field to fuse_attr
- * - add file flags field to fuse_read_in and fuse_write_in
#include <asm/types.h>
@@ -281,8 +280,6 @@ struct fuse_read_in {
#define FUSE_COMPAT_WRITE_IN_SIZE 24
@@ -293,8 +290,6 @@ struct fuse_write_in {
struct fuse_write_out {
diff --git a/include/linux/input.h b/include/linux/input.h
index 2075d6d..b45f240 100644
--- a/include/linux/input.h
+++ b/include/linux/input.h
@@ -530,11 +530,6 @@ struct input_absinfo {
#define KEY_DOLLAR 0x1b2
#define KEY_EURO 0x1b3
-#define KEY_FRAMEBACK 0x1b4 /* Consumer - transport controls */
-#define KEY_FRAMEFORWARD 0x1b5
-#define KEY_CONTEXT_MENU 0x1b6 /* GenDesc - system context menu */
#define KEY_DEL_EOL 0x1c0
#define KEY_DEL_EOS 0x1c1
#define KEY_INS_LINE 0x1c2
diff --git a/include/linux/kd.h b/include/linux/kd.h
index 15f2853..c91fc0c 100644
--- a/include/linux/kd.h
+++ b/include/linux/kd.h
@@ -126,7 +126,7 @@ struct kbdiacrs {
#define KDSKBDIACR 0x4B4B /* write kernel accent table */
- unsigned int diacr, base, result;
+ __u32 diacr, base, result;
struct kbdiacrsuc {
unsigned int kb_cnt; /* number of entries in following array */
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 057a7f3..f0bebd6 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
#include <asm/types.h>
#include <linux/ioctl.h>
+#include <asm/kvm.h>
#define KVM_API_VERSION 12
-/* Architectural interrupt line count. */
-#define KVM_NR_INTERRUPTS 256
/* for KVM_CREATE_MEMORY_REGION */
struct kvm_memory_region {
@@ -23,17 +21,19 @@ struct kvm_memory_region {
__u64 memory_size; /* bytes */
-/* for kvm_memory_region::flags */
-#define KVM_MEM_LOG_DIRTY_PAGES 1UL
-struct kvm_memory_alias {
- __u32 slot; /* this has a different namespace than memory slots */
+/* for KVM_SET_USER_MEMORY_REGION */
+struct kvm_userspace_memory_region {
__u64 guest_phys_addr;
- __u64 memory_size;
- __u64 target_phys_addr;
+ __u64 memory_size; /* bytes */
+ __u64 userspace_addr; /* start of the userspace allocated memory */
+/* for kvm_memory_region::flags */
+#define KVM_MEM_LOG_DIRTY_PAGES 1UL
/* for KVM_IRQ_LINE */
struct kvm_irq_level {
@@ -45,62 +45,16 @@ struct kvm_irq_level {
-/* for KVM_GET_IRQCHIP and KVM_SET_IRQCHIP */
-struct kvm_pic_state {
- __u8 last_irr; /* edge detection */
- __u8 irr; /* interrupt request register */
- __u8 imr; /* interrupt mask register */
- __u8 isr; /* interrupt service register */
- __u8 priority_add; /* highest irq priority */
- __u8 read_reg_select;
- __u8 special_mask;
- __u8 rotate_on_auto_eoi;
- __u8 special_fully_nested_mode;
- __u8 init4; /* true if 4 byte init */
- __u8 elcr; /* PIIX edge/trigger selection */
-#define KVM_IOAPIC_NUM_PINS 24
-struct kvm_ioapic_state {
- __u64 base_address;
- __u8 delivery_mode:3;
- __u8 dest_mode:1;
- __u8 delivery_status:1;
- __u8 remote_irr:1;
- __u8 trig_mode:1;
- __u8 reserved[4];
- } redirtbl[KVM_IOAPIC_NUM_PINS];
-#define KVM_IRQCHIP_PIC_MASTER 0
-#define KVM_IRQCHIP_PIC_SLAVE 1
-#define KVM_IRQCHIP_IOAPIC 2
struct kvm_irqchip {
char dummy[512]; /* reserving space */
struct kvm_pic_state pic;
struct kvm_ioapic_state ioapic;
@@ -179,15 +133,6 @@ struct kvm_run {
-/* for KVM_GET_REGS and KVM_SET_REGS */
- /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
- __u64 rax, rbx, rcx, rdx;
- __u64 rsi, rdi, rsp, rbp;
- __u64 r8, r9, r10, r11;
- __u64 r12, r13, r14, r15;
- __u64 rip, rflags;
/* for KVM_GET_FPU and KVM_SET_FPU */
@@ -204,59 +149,6 @@ struct kvm_fpu {
-/* for KVM_GET_LAPIC and KVM_SET_LAPIC */
-#define KVM_APIC_REG_SIZE 0x400
-struct kvm_lapic_state {
- char regs[KVM_APIC_REG_SIZE];
-struct kvm_segment {
- __u8 present, dpl, db, s, l, g, avl;
-struct kvm_dtable {
- __u16 padding[3];
-/* for KVM_GET_SREGS and KVM_SET_SREGS */
-struct kvm_sregs {
- /* out (KVM_GET_SREGS) / in (KVM_SET_SREGS) */
- struct kvm_segment cs, ds, es, fs, gs, ss;
- struct kvm_segment tr, ldt;
- struct kvm_dtable gdt, idt;
- __u64 cr0, cr2, cr3, cr4, cr8;
- __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
-struct kvm_msr_entry {
-/* for KVM_GET_MSRS and KVM_SET_MSRS */
- __u32 nmsrs; /* number of msrs in entries */
- struct kvm_msr_entry entries[0];
-/* for KVM_GET_MSR_INDEX_LIST */
-struct kvm_msr_list {
- __u32 nmsrs; /* number of msrs in entries */
- __u32 indices[0];
/* for KVM_TRANSLATE */
struct kvm_translation {
@@ -302,22 +194,6 @@ struct kvm_dirty_log {
-struct kvm_cpuid_entry {
-/* for KVM_SET_CPUID */
-struct kvm_cpuid {
- struct kvm_cpuid_entry entries[0];
/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
@@ -347,11 +223,20 @@ struct kvm_signal_mask {
#define KVM_CAP_IRQCHIP 0
#define KVM_CAP_HLT 1
+#define KVM_CAP_MMU_SHADOW_CACHE_CONTROL 2
+#define KVM_CAP_USER_MEMORY 3
+#define KVM_CAP_SET_TSS_ADDR 4
+#define KVM_CAP_EXT_CPUID 5
* ioctls for VM fds
#define KVM_SET_MEMORY_REGION _IOW(KVMIO, 0x40, struct kvm_memory_region)
+#define KVM_SET_NR_MMU_PAGES _IO(KVMIO, 0x44)
+#define KVM_GET_NR_MMU_PAGES _IO(KVMIO, 0x45)
+#define KVM_SET_USER_MEMORY_REGION _IOW(KVMIO, 0x46,\
+ struct kvm_userspace_memory_region)
+#define KVM_SET_TSS_ADDR _IO(KVMIO, 0x47)
* KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
@@ -359,6 +244,7 @@ struct kvm_signal_mask {
#define KVM_CREATE_VCPU _IO(KVMIO, 0x41)
#define KVM_GET_DIRTY_LOG _IOW(KVMIO, 0x42, struct kvm_dirty_log)
#define KVM_SET_MEMORY_ALIAS _IOW(KVMIO, 0x43, struct kvm_memory_alias)
+#define KVM_GET_SUPPORTED_CPUID _IOWR(KVMIO, 0x48, struct kvm_cpuid2)
/* Device model IOC */
#define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60)
#define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level)
@@ -384,5 +270,7 @@ struct kvm_signal_mask {
#define KVM_SET_FPU _IOW(KVMIO, 0x8d, struct kvm_fpu)
#define KVM_GET_LAPIC _IOR(KVMIO, 0x8e, struct kvm_lapic_state)
#define KVM_SET_LAPIC _IOW(KVMIO, 0x8f, struct kvm_lapic_state)
+#define KVM_SET_CPUID2 _IOW(KVMIO, 0x90, struct kvm_cpuid2)
+#define KVM_GET_CPUID2 _IOWR(KVMIO, 0x91, struct kvm_cpuid2)
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index 3b29256..e4db25f 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
#define __LINUX_KVM_PARA_H
- * Guest OS interface for KVM paravirtualization
- * Note: this interface is totally experimental, and is certain to change
- * as we make progress.
+ * This header file provides a method for making a hypercall to the host
+ * Architectures should define:
+ * - kvm_hypercall0, kvm_hypercall1...
+ * - kvm_arch_para_features
+ * - kvm_para_available
- * Per-VCPU descriptor area shared between guest and host. Writable to
- * both guest and host. Registered with the host by the guest when
- * a guest acknowledges paravirtual mode.
- * NOTE: all addresses are guest-physical addresses (gpa), to make it
- * easier for the hypervisor to map between the various addresses.
-struct kvm_vcpu_para_state {
- * API version information for compatibility. If there's any support
- * mismatch (too old host trying to execute too new guest) then
- * the host will deny entry into paravirtual mode. Any other
- * combination (new host + old guest and new host + new guest)
- * is supposed to work - new host versions will support all old
- * guest API versions.
- u32 guest_version;
- u32 host_version;
- * The address of the vm exit instruction (VMCALL or VMMCALL),
- * which the host will patch according to the CPU model the
- u64 hypercall_gpa;
-} __attribute__ ((aligned(PAGE_SIZE)));
-#define KVM_PARA_API_VERSION 1
+/* Return values for hypercalls */
+#define KVM_ENOSYS 1000
- * This is used for an RDMSR's ECX parameter to probe for a KVM host.
- * Hopefully no CPU vendor will use up this number. This is placed well
- * out of way of the typical space occupied by CPU vendors' MSR indices,
- * and we think (or at least hope) it wont be occupied in the future
+ * hypercalls use architecture specific
-#define MSR_KVM_API_MAGIC 0x87655678
-#define KVM_EINVAL 1
+#include <asm/kvm_para.h>
- * Hypercall calling convention:
- * Each hypercall may have 0-6 parameters.
- * 64-bit hypercall index is in RAX, goes from 0 to __NR_hypercalls-1
- * 64-bit parameters 1-6 are in the standard gcc x86_64 calling convention
- * order: RDI, RSI, RDX, RCX, R8, R9.
- * 32-bit index is EBX, parameters are: EAX, ECX, EDX, ESI, EDI, EBP.
- * (the first 3 are according to the gcc regparm calling convention)
- * No registers are clobbered by the hypercall, except that the
- * return value is in RAX.
-#define __NR_hypercalls 0
+static inline int kvm_para_has_feature(unsigned int feature)
+ if (kvm_arch_para_features() & (1UL << feature))
+#endif /* __KERNEL__ */
+#endif /* __LINUX_KVM_PARA_H */
diff --git a/include/linux/pnp.h b/include/linux/pnp.h
index 0a0426c..664d68c 100644
--- a/include/linux/pnp.h
+++ b/include/linux/pnp.h
#include <linux/errno.h>
#include <linux/mod_devicetable.h>
-#define PNP_MAX_PORT 24
-#define PNP_MAX_MEM 12
+#define PNP_MAX_PORT 8
+#define PNP_MAX_MEM 4
#define PNP_MAX_IRQ 2
#define PNP_MAX_DMA 2
#define PNP_NAME_LEN 50
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index f2d0d15..6d5e4a4 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -133,9 +133,6 @@ struct rtc_class_ops {
#define RTC_DEVICE_NAME_SIZE 20
-#define RTC_DEV_BUSY 0
@@ -148,7 +145,7 @@ struct rtc_device
struct mutex ops_lock;
struct cdev char_dev;
- unsigned long flags;
+ struct mutex char_lock;
unsigned long irq_data;
spinlock_t irq_lock;
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 416e000..2597350 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
#define SG_MAGIC 0x87654321
- * We overload the LSB of the page pointer to indicate whether it's
- * a valid sg entry, or whether it points to the start of a new scatterlist.
- * Those low bits are there for everyone! (thanks mason :-)
-#define sg_is_chain(sg) ((sg)->page_link & 0x01)
-#define sg_is_last(sg) ((sg)->page_link & 0x02)
-#define sg_chain_ptr(sg) \
- ((struct scatterlist *) ((sg)->page_link & ~0x03))
* sg_assign_page - Assign a given page to an SG entry
@@ -57,7 +47,6 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
BUG_ON((unsigned long) page & 0x03);
#ifdef CONFIG_DEBUG_SG
BUG_ON(sg->sg_magic != SG_MAGIC);
- BUG_ON(sg_is_chain(sg));
sg->page_link = page_link | (unsigned long) page;
@@ -84,14 +73,7 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page,
-static inline struct page *sg_page(struct scatterlist *sg)
-#ifdef CONFIG_DEBUG_SG
- BUG_ON(sg->sg_magic != SG_MAGIC);
- BUG_ON(sg_is_chain(sg));
- return (struct page *)((sg)->page_link & ~0x3);
+#define sg_page(sg) ((struct page *) ((sg)->page_link & ~0x3))
* sg_set_buf - Set sg entry to point at given data
@@ -106,6 +88,16 @@ static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
+ * We overload the LSB of the page pointer to indicate whether it's
+ * a valid sg entry, or whether it points to the start of a new scatterlist.
+ * Those low bits are there for everyone! (thanks mason :-)
+#define sg_is_chain(sg) ((sg)->page_link & 0x01)
+#define sg_is_last(sg) ((sg)->page_link & 0x02)
+#define sg_chain_ptr(sg) \
+ ((struct scatterlist *) ((sg)->page_link & ~0x03))
* sg_next - return the next scatterlist entry in a list
* @sg: The current sg entry
@@ -187,13 +179,6 @@ static inline void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
#ifndef ARCH_HAS_SG_CHAIN
- * offset and length are unused for chain entry. Clear them.
* Set lowest bit to indicate a link pointer, and make sure to clear
* the termination bit if it happens to be set.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ac3d496..ee800e7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -282,10 +282,6 @@ static inline void touch_all_softlockup_watchdogs(void)
/* Attach to any functions which should be ignored in wchan output. */
#define __sched __attribute__((__section__(".sched.text")))
-/* Linker adds these: start and end of __sched functions */
-extern char __sched_text_start[], __sched_text_end[];
/* Is this address in the __sched functions? */
extern int in_sched_functions(unsigned long addr);
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 1ee2c05..827b85b 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -63,8 +63,6 @@ struct screen_info {
#define VIDEO_TYPE_PMAC 0x60 /* PowerMacintosh frame buffer. */
-#define VIDEO_TYPE_EFI 0x70 /* EFI graphic mode */
extern struct screen_info screen_info;
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 9963f81..6a5203f 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -437,7 +437,7 @@ uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
#ifdef SUPPORT_SYSRQ
if (ch && time_before(jiffies, port->sysrq)) {
- handle_sysrq(ch, port->info ? port->info->tty : NULL);
+ handle_sysrq(ch, port->info->tty);
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 416ee76..c5c8f16 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -157,7 +157,6 @@ struct usb_interface {
enum usb_interface_condition condition; /* state of binding */
unsigned is_active:1; /* the interface is not suspended */
- unsigned sysfs_files_created:1; /* the sysfs attributes exist */
unsigned needs_remote_wakeup:1; /* driver requires remote wakeup */
struct device dev; /* interface specific device info */
diff --git a/include/linux/usbdevice_fs.h b/include/linux/usbdevice_fs.h
index 8ca5a7f..342dd5a 100644
--- a/include/linux/usbdevice_fs.h
+++ b/include/linux/usbdevice_fs.h
@@ -102,8 +102,7 @@ struct usbdevfs_urb {
int number_of_packets;
- unsigned int signr; /* signal to be sent on completion,
- or 0 if none should be sent. */
+ unsigned int signr; /* signal to be sent on error, -1 if none should be sent */
struct usbdevfs_iso_packet_desc iso_frame_desc[0];
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 6ca7b97..1e04cd4 100644
@@ -1138,10 +1138,8 @@ asmlinkage long sys_mq_getsetattr(mqd_t mqdes,
omqstat.mq_flags = filp->f_flags & O_NONBLOCK;
ret = audit_mq_getsetattr(mqdes, &mqstat);
- spin_unlock(&info->lock);
if (mqstat.mq_flags & O_NONBLOCK)
filp->f_flags |= O_NONBLOCK;
diff --git a/kernel/exit.c b/kernel/exit.c
index 549c055..cd0f1d4 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1357,7 +1357,7 @@ static int wait_task_stopped(struct task_struct *p, int delayed_group_leader,
int __user *stat_addr, struct rusage __user *ru)
int retval, exit_code;
+ struct pid_namespace *ns;
@@ -1376,11 +1376,12 @@ static int wait_task_stopped(struct task_struct *p, int delayed_group_leader,
* keep holding onto the tasklist_lock while we call getrusage and
* possibly take page faults for user memory.
- pid = task_pid_nr_ns(p, current->nsproxy->pid_ns);
+ ns = current->nsproxy->pid_ns;
get_task_struct(p);
read_unlock(&tasklist_lock);
if (unlikely(noreap)) {
+ pid_t pid = task_pid_nr_ns(p, ns);
uid_t uid = p->uid;
int why = (p->ptrace & PT_PTRACED) ? CLD_TRAPPED : CLD_STOPPED;
@@ -1388,7 +1389,7 @@ static int wait_task_stopped(struct task_struct *p, int delayed_group_leader,
if (unlikely(!exit_code) || unlikely(p->exit_state))
return wait_noreap_copyout(p, pid, uid,
+ why, (exit_code << 8) | 0x7f,
@@ -1450,11 +1451,11 @@ bail_ref:
if (!retval && infop)
retval = put_user(exit_code, &infop->si_status);
if (!retval && infop)
- retval = put_user(pid, &infop->si_pid);
+ retval = put_user(task_pid_nr_ns(p, ns), &infop->si_pid);
if (!retval && infop)
retval = put_user(p->uid, &infop->si_uid);
+ retval = task_pid_nr_ns(p, ns);
put_task_struct(p);
diff --git a/kernel/fork.c b/kernel/fork.c
index 8ca1a14..89c0087 100644
23984
--- a/kernel/fork.c
23985
+++ b/kernel/fork.c
23986
@@ -392,6 +392,7 @@ void fastcall __mmdrop(struct mm_struct *mm)
23987
destroy_context(mm);
23990
+EXPORT_SYMBOL_GPL(__mmdrop);
23993
* Decrement the use count and release all resources for an mm.
23994
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 2fc2581..474219a 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
 /* These will be re-linked against their real values during the second link stage */
 extern const unsigned long kallsyms_addresses[] __attribute__((weak));
+extern const unsigned long kallsyms_num_syms __attribute__((weak));
 extern const u8 kallsyms_names[] __attribute__((weak));
-/* tell the compiler that the count isn't in the small data section if the arch
- * has one (eg: FRV)
-extern const unsigned long kallsyms_num_syms
-__attribute__((weak, section(".rodata")));
 extern const u8 kallsyms_token_table[] __attribute__((weak));
 extern const u16 kallsyms_token_index[] __attribute__((weak));
diff --git a/kernel/sched.c b/kernel/sched.c
index 98dcdf2..38933ca 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5466,7 +5466,7 @@ sd_alloc_ctl_domain_table(struct sched_domain *sd)
-static ctl_table *sd_alloc_ctl_cpu_table(int cpu)
+static ctl_table * sd_alloc_ctl_cpu_table(int cpu)
 	struct ctl_table *entry, *table;
 	struct sched_domain *sd;
@@ -6708,6 +6708,9 @@ void __init sched_init_smp(void)
 int in_sched_functions(unsigned long addr)
+	/* Linker adds these: start and end of __sched functions */
+	extern char __sched_text_start[], __sched_text_end[];
 	return in_lock_functions(addr) ||
 		(addr >= (unsigned long)__sched_text_start
 		&& addr < (unsigned long)__sched_text_end);
diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index d30467b..5d0d623 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -327,12 +327,10 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
 		avg_per_cpu = p->se.sum_exec_runtime;
-		if (p->se.nr_migrations) {
-			avg_per_cpu = div64_64(avg_per_cpu,
-					       p->se.nr_migrations);
+		if (p->se.nr_migrations)
+			avg_per_cpu = div64_64(avg_per_cpu, p->se.nr_migrations);
 			avg_per_cpu = -1LL;
diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
index 5b32433..630178e 100644
--- a/kernel/sched_stats.h
+++ b/kernel/sched_stats.h
@@ -52,8 +52,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
 				    sd->lb_nobusyq[itype],
 				    sd->lb_nobusyg[itype]);
-			    " %u %u %u %u %u %u %u %u %u %u %u %u\n",
+			seq_printf(seq, " %u %u %u %u %u %u %u %u %u %u %u %u\n",
 			    sd->alb_count, sd->alb_failed, sd->alb_pushed,
 			    sd->sbe_count, sd->sbe_balanced, sd->sbe_pushed,
 			    sd->sbf_count, sd->sbf_balanced, sd->sbf_pushed,
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cb89fa8..27a2338 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -133,8 +133,6 @@ void tick_nohz_update_jiffies(void)
 	if (!ts->tick_stopped)
-	touch_softlockup_watchdog();
 	cpu_clear(cpu, nohz_cpu_mask);
diff --git a/kernel/utsname_sysctl.c b/kernel/utsname_sysctl.c
index fe3a56c..c76c064 100644
--- a/kernel/utsname_sysctl.c
+++ b/kernel/utsname_sysctl.c
 static void *get_uts(ctl_table *table, int write)
 	char *which = table->data;
-	struct uts_namespace *uts_ns;
-	uts_ns = current->nsproxy->uts_ns;
-	which = (which - (char *)&init_uts_ns) + (char *)uts_ns;
 		down_read(&uts_sem);
diff --git a/lib/hexdump.c b/lib/hexdump.c
index 3435465..bd5edae 100644
--- a/lib/hexdump.c
+++ b/lib/hexdump.c
@@ -106,8 +106,7 @@ void hex_dump_to_buffer(const void *buf, size_t len, int rowsize,
 		while (lx < (linebuflen - 1) && lx < (ascii_column - 1))
 			linebuf[lx++] = ' ';
 		for (j = 0; (j < rowsize) && (j < len) && (lx + 2) < linebuflen; j++)
-			linebuf[lx++] = (isascii(ptr[j]) && isprint(ptr[j])) ? ptr[j]
+			linebuf[lx++] = isprint(ptr[j]) ? ptr[j] : '.';
 	linebuf[lx++] = '\0';
diff --git a/lib/kobject.c b/lib/kobject.c
index b52e9f4..a7e3bf4 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -313,8 +313,8 @@ int kobject_rename(struct kobject * kobj, const char *new_name)
 	struct kobject *temp_kobj;
 	temp_kobj = kset_find_obj(kobj->kset, new_name);
-		printk(KERN_WARNING "kobject '%s' cannot be renamed "
-		       "to '%s' as '%s' is already in existence.\n",
+		printk(KERN_WARNING "kobject '%s' can not be renamed "
+		       "to '%s' as '%s' is already in existance.\n",
 		       kobject_name(kobj), new_name, new_name);
 		kobject_put(temp_kobj);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b5a58d4..12376ae 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -305,6 +305,7 @@ static inline void prep_zero_page(struct page *page, int order, gfp_t gfp_flags)
+	VM_BUG_ON((gfp_flags & (__GFP_WAIT | __GFP_HIGHMEM)) == __GFP_HIGHMEM);
 	 * clear_highpage() will use KM_USER0, so it's a bug to use __GFP_ZERO
 	 * and __GFP_HIGHMEM from hard or soft interrupt context.
@@ -3265,16 +3266,6 @@ static void inline setup_usemap(struct pglist_data *pgdat,
 #endif /* CONFIG_SPARSEMEM */
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
-/* Return a sensible default order for the pageblock size. */
-static inline int pageblock_default_order(void)
-	if (HPAGE_SHIFT > PAGE_SHIFT)
-		return HUGETLB_PAGE_ORDER;
-	return MAX_ORDER-1;
 /* Initialise the number of pages represented by NR_PAGEBLOCK_BITS */
 static inline void __init set_pageblock_order(unsigned int order)
@@ -3290,16 +3281,7 @@ static inline void __init set_pageblock_order(unsigned int order)
 #else /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
- * When CONFIG_HUGETLB_PAGE_SIZE_VARIABLE is not set, set_pageblock_order()
- * and pageblock_default_order() are unused as pageblock_order is set
- * at compile-time. See include/linux/pageblock-flags.h for the values of
- * pageblock_order based on the kernel config
-static inline int pageblock_default_order(unsigned int order)
-	return MAX_ORDER-1;
+/* Defined this way to avoid accidently referencing HUGETLB_PAGE_ORDER */
 #define set_pageblock_order(x)	do {} while (0)
 #endif /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
@@ -3384,7 +3366,7 @@ static void __meminit free_area_init_core(struct pglist_data *pgdat,
-		set_pageblock_order(pageblock_default_order());
+		set_pageblock_order(HUGETLB_PAGE_ORDER);
 		setup_usemap(pgdat, zone, size);
 		ret = init_currently_empty_zone(zone, zone_start_pfn,
 						size, MEMMAP_EARLY);
diff --git a/mm/shmem.c b/mm/shmem.c
index 51b3d6c..253d205 100644
@@ -1072,7 +1072,7 @@ shmem_alloc_page(gfp_t gfp, struct shmem_inode_info *info,
 	pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, idx);
 	pvma.vm_pgoff = idx;
 	pvma.vm_end = PAGE_SIZE;
-	page = alloc_page_vma(gfp, &pvma, 0);
+	page = alloc_page_vma(gfp | __GFP_ZERO, &pvma, 0);
 	mpol_free(pvma.vm_policy);
@@ -1093,7 +1093,7 @@ shmem_swapin(struct shmem_inode_info *info,swp_entry_t entry,unsigned long idx)
 static inline struct page *
 shmem_alloc_page(gfp_t gfp,struct shmem_inode_info *info, unsigned long idx)
-	return alloc_page(gfp);
+	return alloc_page(gfp | __GFP_ZERO);
@@ -1306,7 +1306,6 @@ repeat:
 		spin_unlock(&info->lock);
-		clear_highpage(filepage);
 		flush_dcache_page(filepage);
 		SetPageUptodate(filepage);
diff --git a/mm/slab.c b/mm/slab.c
index 202465a..c31cd36 100644
@@ -2881,8 +2881,6 @@ static void *cache_free_debugcheck(struct kmem_cache *cachep, void *objp,
 	unsigned int objnr;
 	struct slab *slabp;
-	BUG_ON(virt_to_cache(objp) != cachep);
 	objp -= obj_offset(cachep);
 	kfree_debugcheck(objp);
 	page = virt_to_head_page(objp);
@@ -3761,6 +3759,8 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 	unsigned long flags;
+	BUG_ON(virt_to_cache(objp) != cachep);
 	local_irq_save(flags);
 	debug_check_no_locks_freed(objp, obj_size(cachep));
 	__cache_free(cachep, objp);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index cd75b21..22620f6 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
  * or to back the page tables that are used to create the mapping.
  * Uses the main allocators if they are available, else bootmem.
-static void * __init_refok __earlyonly_bootmem_alloc(int node,
-				unsigned long size,
-				unsigned long align,
-				unsigned long goal)
-	return __alloc_bootmem_node(NODE_DATA(node), size, align, goal);
 void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 	/* If the main allocator is up use that, fallback to bootmem. */
@@ -54,7 +44,7 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 		return page_address(page);
-		return __earlyonly_bootmem_alloc(node, size, size,
+		return __alloc_bootmem_node(NODE_DATA(node), size, size,
 				__pa(MAX_DMA_ADDRESS));
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 579f50f..cbb4258 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -9,7 +9,7 @@ use strict;
 use Getopt::Long qw(:config no_auto_abbrev);
@@ -19,11 +19,8 @@ my $chk_signoff = 1;
 	'q|quiet+'	=> \$quiet,
@@ -32,13 +29,10 @@ GetOptions(
 	'patch!'	=> \$chk_patch,
 	'test-type!'	=> \$tst_type,
 	'emacs!'	=> \$emacs,
-	'terse!'	=> \$terse,
 	'subjective!'	=> \$check,
 	'strict!'	=> \$check,
 	'root=s'	=> \$root,
-	'summary!'	=> \$summary,
-	'mailback!'	=> \$mailback,
@@ -48,7 +42,6 @@ if ($#ARGV < 0) {
 	print "version: $V\n";
 	print "options: -q               => quiet\n";
 	print "         --no-tree        => run without a kernel tree\n";
-	print "         --terse          => one line per report\n";
 	print "         --emacs          => emacs compile window format\n";
 	print "         --file           => check a source file\n";
 	print "         --strict         => enable more subjective tests\n";
@@ -56,11 +49,6 @@ if ($#ARGV < 0) {
 if (defined $root) {
 	if (!top_of_kernel_tree($root)) {
@@ -102,6 +90,41 @@ our $Attribute = qr{
 			__(?:mem|cpu|dev|)(?:initdata|init)
 our $Inline	= qr{inline|__always_inline|noinline};
+our $NonptrType = qr{
+			long\s+long\s+int|
+			(?:__)?(?:u|s|be|le)(?:8|16|32|64)|
+			${Ident}_handler|
+			${Ident}_handler_fn
+		(?:\s*\*+\s*const|\s*\*+|(?:\s*\[\s*\])+)?
+		(?:\s+$Sparse|\s+$Attribute)*
+our $Declare	= qr{(?:$Storage\s+)?$Type};
 our $Member	= qr{->$Ident|\.$Ident|\[[^]]*\]};
 our $Lval	= qr{$Ident(?:$Member)*};
@@ -113,50 +136,7 @@ our $Operators	= qr{
 			&&|\|\||,|\^|\+\+|--|&|\||\+|-|\*|\/
-	qr{long\s+long\s+int},
-	qr{(?:__)?(?:u|s|be|le)(?:8|16|32|64)},
-	qr{struct\s+$Ident},
-	qr{union\s+$Ident},
-	qr{enum\s+$Ident},
-	qr{${Ident}_handler},
-	qr{${Ident}_handler_fn},
-	my $all = "(?:  \n" . join("|\n  ", @typeList) . "\n)";
-	$NonptrType	= qr{
-			(?:\s+$Sparse|\s+const)*
-			(?:\s*\*+\s*const|\s*\*+|(?:\s*\[\s*\])+)?
-			(?:\s+$Sparse|\s+$Attribute)*
-	$Declare	= qr{(?:$Storage\s+)?$Type};
 $chk_signoff = 0 if ($file);
@@ -298,81 +278,6 @@ sub sanitise_line {
-sub ctx_statement_block {
-	my ($linenr, $remain, $off) = @_;
-	my $line = $linenr - 1;
-	my $coff = $off - 1;
-		#warn "CSB: blk<$blk>\n";
-		# If we are about to drop off the end, pull in more
-		if ($off >= $len) {
-			for (; $remain > 0; $line++) {
-				next if ($rawlines[$line] =~ /^-/);
-				$blk .= sanitise_line($rawlines[$line]) . "\n";
-				$len = length($blk);
-		# Bail if there is no further context.
-		#warn "CSB: blk<$blk> off<$off> len<$len>\n";
-		if ($off == $len) {
-		$c = substr($blk, $off, 1);
-		#warn "CSB: c<$c> type<$type> level<$level>\n";
-		# Statement ends at the ';' or a close '}' at the
-		# outermost level.
-		if ($level == 0 && $c eq ';') {
-		if (($type eq '' || $type eq '(') && $c eq '(') {
-		if ($type eq '(' && $c eq ')') {
-			$type = ($level != 0)? '(' : '';
-			if ($level == 0 && $coff < $soff) {
-		if (($type eq '' || $type eq '{') && $c eq '{') {
-		if ($type eq '{' && $c eq '}') {
-			$type = ($level != 0)? '{' : '';
-			if ($level == 0) {
-	my $statement = substr($blk, $soff, $off - $soff + 1);
-	my $condition = substr($blk, $soff, $coff - $soff + 1);
-	#warn "STATEMENT<$statement>\n";
-	#warn "CONDITION<$condition>\n";
-	return ($statement, $condition);
 sub ctx_block_get {
 	my ($linenr, $remain, $outer, $open, $close, $off) = @_;
@@ -516,6 +421,9 @@ sub annotate_values {
+	# Include any user defined types we may have found as we went.
+	my $type_match = "(?:$Type$Bare)";
 	while (length($cur)) {
 		print " <$type> " if ($debug);
 		if ($cur =~ /^(\s+)/o) {
@@ -525,7 +433,7 @@ sub annotate_values {
-		} elsif ($cur =~ /^($Type)/) {
+		} elsif ($cur =~ /^($type_match)/) {
 			print "DECLARE($1)\n" if ($debug);
@@ -549,7 +457,7 @@ sub annotate_values {
-		} elsif ($cur =~ /^(if|while|typeof|for)\b/o) {
+		} elsif ($cur =~ /^(if|while|typeof)\b/o) {
 			print "COND($1)\n" if ($debug);
 			$paren_type[$paren] = 'N';
@@ -607,30 +515,11 @@ sub annotate_values {
-	my ($possible) = @_;
-	#print "CHECK<$possible>\n";
-	if ($possible !~ /^(?:$Storage|$Type|DEFINE_\S+)$/ &&
-	    $possible ne 'goto' && $possible ne 'return' &&
-	    $possible ne 'struct' && $possible ne 'enum' &&
-	    $possible ne 'case' && $possible ne 'else' &&
-	    $possible ne 'typedef') {
-		#print "POSSIBLE<$possible>\n";
-		push(@typeList, $possible);
-	my $line = $prefix . $_[0];
-	$line = (split('\n', $line))[0] . "\n" if ($terse);
-	push(@report, $line);
+	push(@report, $prefix . $_[0]);
@@ -685,6 +574,9 @@ sub process {
 	my $prev_values = 'N';
+	# Possible bare types.
 	# Pre-scan the patch looking for any __setup documentation.
 	my @setup_docs = ();
 	my $setup_docs = 0;
@@ -739,35 +631,21 @@ sub process {
 			$realcnt-- if ($realcnt != 0);
-			# Guestimate if this is a continuing comment.  Run
-			# the context looking for a comment "edge".  If this
-			# edge is a close comment then we must be in a comment
-			# at context start.
-			if ($linenr == $first_line) {
-				for (my $ln = $first_line; $ln < ($linenr + $realcnt); $ln++) {
-					($edge) = ($lines[$ln - 1] =~ m@(/\*|\*/)@);
-					last if (defined $edge);
-				if (defined $edge && $edge eq '*/') {
+			# track any sort of multi-line comment.  Obviously if
+			# the added text or context do not include the whole
+			# comment we will not see it. Such is life.
 			# Guestimate if this is a continuing comment.  If this
 			# is the start of a diff block and this line starts
 			# ' *' then it is very likely a comment.
 			if ($linenr == $first_line and $line =~ m@^.\s*\*@) {
-			# Find the last comment edge on _this_ line.
-			while (($line =~ m@(/\*|\*/)@g)) {
-				if ($1 eq '/*') {
+			if ($line =~ m@/\*@) {
+			if ($line =~ m@\*/@) {
 		# Measure the line length and indent.
@@ -809,7 +687,7 @@ sub process {
 # Check for wrappage within a valid hunk of the file
-		if ($realcnt != 0 && $line !~ m{^(?:\+|-| |\\ No newline|$)}) {
+		if ($realcnt != 0 && $line !~ m{^(?:\+|-| |$)}) {
 			ERROR("patch seems to be corrupt (line wrapped?)\n" .
 				$herecurr) if (!$emitted_corrupt++);
@@ -849,11 +727,6 @@ sub process {
 			WARN("line over 80 characters\n" . $herecurr);
-# check for adding lines without a newline.
-		if ($line =~ /^\+/ && defined $lines[$linenr] && $lines[$linenr] =~ /^\\ No newline at end of file/) {
-			WARN("adding a line without newline at end of file\n" . $herecurr);
 # check we are in a valid source file *.[hc] if not then ignore this hunk
 		next if ($realfile !~ /\.[hc]$/);
@@ -879,41 +752,30 @@ sub process {
 # Check for potential 'bare' types
+		    $line !~ /^.\s*(?:$Storage\s+)?(?:$Inline\s+)?$Type\b/ &&
 		    $line !~ /$Ident:\s*$/ &&
-		    ($line =~ /^.\s*$Ident\s*\(\*+\s*$Ident\)\s*\(/ ||
-		     $line !~ /^.\s*$Ident\s*\(/)) {
-			# definitions in global scope can only start with types
-			if ($line =~ /^.(?:$Storage\s+)?(?:$Inline\s+)?(?:const\s+)?($Ident)\b/) {
-			# declarations always start with types
-			} elsif ($prev_values eq 'N' && $line =~ /^.\s*(?:$Storage\s+)?($Ident)\b\s*\**\s*$Ident\s*(?:;|=)/) {
-			# any (foo ... *) is a pointer cast, and foo is a type
-			} elsif ($line =~ /\(($Ident)(?:\s+$Sparse)*\s*\*+\s*\)/) {
-			# Check for any sort of function declaration.
-			# int foo(something bar, other baz);
-			# void (*store_gdt)(x86_descr_ptr *);
-			if ($prev_values eq 'N' && $line =~ /^(.(?:(?:$Storage|$Inline)\s*)*\s*$Type\s*(?:\b$Ident|\(\*\s*$Ident\))\s*)\(/) {
-				my ($name_len) = length($1);
-				my ($level, @ctx) = ctx_statement_level($linenr, $realcnt, $name_len);
-				my $ctx = join("\n", @ctx);
-				substr($ctx, 0, $name_len + 1) = '';
-				$ctx =~ s/\)[^\)]*$//;
-				for my $arg (split(/\s*,\s*/, $ctx)) {
-					if ($arg =~ /^(?:const\s+)?($Ident)(?:\s+$Sparse)*\s*\**\s*(:?\b$Ident)?$/ || $arg =~ /^($Ident)$/) {
+		    $line !~ /^.\s*$Ident\s*\(/ &&
+		    # definitions in global scope can only start with types
+		    ($line =~ /^.(?:$Storage\s+)?(?:$Inline\s+)?($Ident)\b/ ||
+		    # declarations always start with types
+		     $line =~ /^.\s*(?:$Storage\s+)?($Ident)\b\s*\**\s*$Ident\s*(?:;|=)/) ||
+		    # any (foo ... *) is a pointer cast, and foo is a type
+		    $line =~ /\(($Ident)(?:\s+$Sparse)*\s*\*+\s*\)/) {
+			my $possible = $1;
+			if ($possible !~ /^(?:$Storage|$Type|DEFINE_\S+)$/ &&
+			    $possible ne 'goto' && $possible ne 'return' &&
+			    $possible ne 'struct' && $possible ne 'enum' &&
+			    $possible ne 'case' && $possible ne 'else' &&
+			    $possible ne 'typedef') {
+				#print "POSSIBLE<$possible>\n";
+				push(@bare, $possible);
+				my $bare = join("|", @bare);
+				$Bare	= '|' . qr{
+					(?:\s*\*+\s*const|\s*\*+|(?:\s*\[\s*\])+)?
@@ -1073,10 +935,6 @@ sub process {
-		if ($line =~ /\bLINUX_VERSION_CODE\b/) {
-			WARN("LINUX_VERSION_CODE should be avoided, code should be for the version to which it is merged" . $herecurr);
 # printk should use KERN_* levels.  Note that follow on printk's on the
 # same line do not need a level, so we use the current block context
 # to try and find and validate the current printk.  In summary the current
@@ -1107,12 +965,6 @@ sub process {
 			ERROR("open brace '{' following function declarations go on the next line\n" . $herecurr);
-# open braces for enum, union and struct go on the same line.
-		if ($line =~ /^.\s*{/ &&
-		    $prevline =~ /^.\s*(?:typedef\s+)?(enum|union|struct)(?:\s+$Ident)?\s*$/) {
-			ERROR("open brace '{' following $1 go on the same line\n" . $hereprev);
 # check for spaces between functions and their parentheses.
 		while ($line =~ /($Ident)\s+\(/g) {
 			if ($1 !~ /^(?:if|for|while|switch|return|volatile|__volatile__|__attribute__|format|__extension__|Copyright|case)$/ &&
@@ -1320,27 +1172,9 @@ sub process {
 # Check for illegal assignment in if conditional.
-		if ($line =~ /\bif\s*\(/) {
-			my ($s, $c) = ctx_statement_block($linenr, $realcnt, 0);
-			if ($c =~ /\bif\s*\(.*[^<>!=]=[^=].*/) {
-				ERROR("do not use assignment in if condition ($c)\n" . $herecurr);
-			# Find out what is on the end of the line after the
-			substr($s, 0, length($c)) = '';
-			if (length($c) && $s !~ /^\s*({|;|\/\*.*\*\/)?\s*\\*\s*$/) {
-				ERROR("trailing statements should be on next line\n" . $herecurr);
-# if and else should not have general statements after it
-		if ($line =~ /^.\s*(?:}\s*)?else\b(.*)/ &&
-		    $1 !~ /^\s*(?:\sif|{|\\|$)/) {
-			ERROR("trailing statements should be on next line\n" . $herecurr);
+		if ($line=~/\bif\s*\(.*[^<>!=]=[^=]/) {
+			#next if ($line=~/\".*\Q$op\E.*\"/ or $line=~/\'\Q$op\E\'/);
+			ERROR("do not use assignment in if condition\n" . $herecurr);
 # Check for }<nl>else {, these must be at the same
@@ -1371,6 +1205,12 @@ sub process {
+# if and else should not have general statements after it
+		if ($line =~ /^.\s*(?:}\s*)?else\b(.*)/ &&
+		    $1 !~ /^\s*(?:\sif|{|\\|$)/) {
+			ERROR("trailing statements should be on next line\n" . $herecurr);
 # multi-statement macros should be enclosed in a do while loop, grab the
 # first statement and ensure its the whole macro if its not enclosed
 # in a known goot container
@@ -1393,10 +1233,6 @@ sub process {
-			while ($lines[$ln - 1] =~ /^-/) {
 			my @ctx = ctx_statement($ln, $cnt, $off);
 			my $ctx_ln = $ln + $#ctx + 1;
@@ -1432,23 +1268,25 @@ sub process {
 			if ($lines[$nr - 1] =~ /{\s*$/) {
 				my ($lvl, @block) = ctx_block_level($nr, $cnt);
-				my $stmt = join("\n", @block);
-				# Drop the diff line leader.
-				$stmt =~ s/\n./\n/g;
-				# Drop the code outside the block.
-				$stmt =~ s/(^[^{]*){\s*//;
+				my $stmt = join(' ', @block);
+				$stmt =~ s/(^[^{]*){//;
-				$stmt =~ s/\s*}([^}]*$)//;
+				$stmt =~ s/}([^}]*$)//;
 				#print "block<" . join(' ', @block) . "><" . scalar(@block) . ">\n";
 				#print "stmt<$stmt>\n\n";
-				# Count the newlines, if there is only one
-				# then the block should not have {}'s.
-				my @lines = ($stmt =~ /\n/g);
-				#print "lines<" . scalar(@lines) . ">\n";
-				if ($lvl == 0 && scalar(@lines) == 0 &&
+				# Count the ;'s if there is fewer than two
+				# then there can only be one statement,
+				# if there is a brace inside we cannot
+				# trivially detect if its one statement.
+				# Also nested if's often require braces to
+				# disambiguate the else binding so shhh there.
+				my @semi = ($stmt =~ /;/g);
+				push(@semi, "/**/") if ($stmt =~ m@/\*@);
+				##print "semi<" . scalar(@semi) . ">\n";
+				if ($lvl == 0 && scalar(@semi) < 2 &&
 				    $stmt !~ /{/ && $stmt !~ /\bif\b/ &&
 				    $before !~ /}/ && $after !~ /{/) {
 					my $herectx = "$here\n" . join("\n", @control, @block[1 .. $#block]) . "\n";
@@ -1534,11 +1372,6 @@ sub process {
 			ERROR("inline keyword should sit between storage class and type\n" . $herecurr);
-# Check for __inline__ and __inline, prefer inline
-		if ($line =~ /\b(__inline__|__inline)\b/) {
-			WARN("plain inline is preferred over $1\n" . $herecurr);
 # check for new externs in .c files.
 		if ($line =~ /^.\s*extern\s/ && ($realfile =~ /\.c$/)) {
 			WARN("externs should be avoided in .c files\n" . $herecurr);
@@ -1559,33 +1392,21 @@ sub process {
-	# In mailback mode only produce a report in the negative, for
-	# things that appear to be patches.
-	if ($mailback && ($clean == 1 || !$is_patch)) {
-	# This is not a patch, and we are are in 'no-patch' mode so
-	# just keep quiet.
-	if (!$chk_patch && !$is_patch) {
-	if (!$is_patch) {
+	if ($chk_patch && !$is_patch) {
 		ERROR("Does not appear to be a unified-diff format patch\n");
 	if ($is_patch && $chk_signoff && $signoff == 0) {
 		ERROR("Missing Signed-off-by: line(s)\n");
-	print report_dump();
-	print "total: $cnt_error errors, $cnt_warn warnings, " .
-		(($check)? "$cnt_chk checks, " : "") .
-		"$cnt_lines lines checked\n";
-	print "\n" if ($quiet == 0);
+	if ($clean == 0 && ($chk_patch || $is_patch)) {
+		print report_dump();
+		if ($quiet < 2) {
+			print "total: $cnt_error errors, $cnt_warn warnings, " .
+				(($check)? "$cnt_chk checks, " : "") .
+				"$cnt_lines lines checked\n";
 	if ($clean == 1 && $quiet == 0) {
 		print "Your patch has no obvious style problems and is ready for submission.\n"
diff --git a/security/commoncap.c b/security/commoncap.c
index 5bc1895..302e8d0 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -526,15 +526,6 @@ int cap_task_kill(struct task_struct *p, struct siginfo *info,
 	if (info != SEND_SIG_NOINFO && (is_si_special(info) || SI_FROMKERNEL(info)))
-	 * Running a setuid root program raises your capabilities.
-	 * Killing your own setuid root processes was previously
-	 * We must preserve legacy signal behavior in this case.
-	if (p->euid == 0 && p->uid == current->uid)
 	/* sigcont is permitted within same session */
 	if (sig == SIGCONT && (task_session_nr(current) == task_session_nr(p)))