commit fb680c7e5c3697ac5d97a3e1a91e9e878f206d8d Author: Ben Hutchings Date: Thu Dec 19 15:59:05 2019 +0000 Linux 3.16.80 commit 58f675edecc46aec746cfea8407932a4459875dd Author: zhangyi (F) Date: Wed Nov 6 17:43:52 2019 +0800 fs/dcache: move security_d_instantiate() behind attaching dentry to inode During backport 1e2e547a93a "do d_instantiate/unlock_new_inode combinations safely", there was a error instantiating sequence of attaching dentry to inode and calling security_d_instantiate(). Before commit ce23e640133 "->getxattr(): pass dentry and inode as separate arguments" and b96809173e9 "security_d_instantiate(): move to the point prior to attaching dentry to inode", security_d_instantiate() should be called beind __d_instantiate(), otherwise it will trigger below problem when CONFIG_SECURITY_SMACK on ext4 was enabled because d_inode(dentry) used by ->getxattr() is NULL before __d_instantiate() instantiate inode. [ 31.858026] BUG: unable to handle kernel paging request at ffffffffffffff70 ... [ 31.882024] Call Trace: [ 31.882378] [] ext4_xattr_get+0x8c/0x3e0 [ 31.883195] [] ext4_xattr_security_get+0x24/0x40 [ 31.884086] [] generic_getxattr+0x5b/0x90 [ 31.884907] [] smk_fetch+0xb4/0x150 [ 31.885634] [] smack_d_instantiate+0x1c2/0x550 [ 31.886508] [] security_d_instantiate+0x3a/0x80 [ 31.887389] [] d_instantiate_new+0x36/0x130 [ 31.888223] [] ext4_mkdir+0x4af/0x6a0 [ 31.888928] [] vfs_mkdir+0x100/0x280 [ 31.889536] [] SyS_mkdir+0xb6/0x170 [ 31.890255] [] ? trace_do_page_fault+0x95/0x2b0 [ 31.891134] [] entry_SYSCALL_64_fastpath+0x18/0x73 Cc: # 3.16, 4.4 Signed-off-by: zhangyi (F) Signed-off-by: Ben Hutchings commit 79cf668f6a1336c35e2a7e9fb39cbd325d408484 Author: Andrey Ryabinin Date: Thu Nov 21 17:54:01 2019 -0800 mm/ksm.c: don't WARN if page is still mapped in remove_stable_node() commit 9a63236f1ad82d71a98aa80320b6cb618fb32f44 upstream. It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in remove_stable_node() when it races with __mmput() and squeezes in between ksm_exit() and exit_mmap(). WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150 Call Trace: remove_all_stable_nodes+0x12b/0x330 run_store+0x4ef/0x7b0 kernfs_fop_write+0x200/0x420 vfs_write+0x154/0x450 ksys_write+0xf9/0x1d0 do_syscall_64+0x99/0x510 entry_SYSCALL_64_after_hwframe+0x49/0xbe Remove the warning as there is nothing scary going on. Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com Fixes: cbf86cfe04a6 ("ksm: remove old stable nodes more thoroughly") Signed-off-by: Andrey Ryabinin Acked-by: Hugh Dickins Cc: Andrea Arcangeli Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 5d9b8c6e7ab6130e238269dbbe26de0c75724340 Author: Martin Habets Date: Thu Nov 21 17:52:15 2019 +0000 sfc: Only cancel the PPS workqueue if it exists commit 723eb53690041740a13ac78efeaf6804f5d684c9 upstream. The workqueue only exists for the primary PF. For other functions we hit a WARN_ON in kernel/workqueue.c. Fixes: 7c236c43b838 ("sfc: Add support for IEEE-1588 PTP") Signed-off-by: Martin Habets Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit d1dab3b8ec630c032e629da5342d0c6a6b60d58e Author: Davide Caratti Date: Tue Nov 19 23:47:33 2019 +0100 net/sched: act_pedit: fix WARN() in the traffic path commit f67169fef8dbcc1ac6a6a109ecaad0d3b259002c upstream. when configuring act_pedit rules, the number of keys is validated only on addition of a new entry. This is not sufficient to avoid hitting a WARN() in the traffic path: for example, it is possible to replace a valid entry with a new one having 0 extended keys, thus causing splats in dmesg like: pedit BUG: index 42 WARNING: CPU: 2 PID: 4054 at net/sched/act_pedit.c:410 tcf_pedit_act+0xc84/0x1200 [act_pedit] [...] RIP: 0010:tcf_pedit_act+0xc84/0x1200 [act_pedit] Code: 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ac 00 00 00 48 8b 44 24 10 48 c7 c7 a0 c4 e4 c0 8b 70 18 e8 1c 30 95 ea <0f> 0b e9 a0 fa ff ff e8 00 03 f5 ea e9 14 f4 ff ff 48 89 58 40 e9 RSP: 0018:ffff888077c9f320 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffac2983a2 RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888053927bec RBP: dffffc0000000000 R08: ffffed100a726209 R09: ffffed100a726209 R10: 0000000000000001 R11: ffffed100a726208 R12: ffff88804beea780 R13: ffff888079a77400 R14: ffff88804beea780 R15: ffff888027ab2000 FS: 00007fdeec9bd740(0000) GS:ffff888053900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffdb3dfd000 CR3: 000000004adb4006 CR4: 00000000001606e0 Call Trace: tcf_action_exec+0x105/0x3f0 tcf_classify+0xf2/0x410 __dev_queue_xmit+0xcbf/0x2ae0 ip_finish_output2+0x711/0x1fb0 ip_output+0x1bf/0x4b0 ip_send_skb+0x37/0xa0 raw_sendmsg+0x180c/0x2430 sock_sendmsg+0xdb/0x110 __sys_sendto+0x257/0x2b0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0xa5/0x4e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fdeeb72e993 Code: 48 8b 0d e0 74 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 0d d6 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24 RSP: 002b:00007ffdb3de8a18 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055c81972b700 RCX: 00007fdeeb72e993 RDX: 0000000000000040 RSI: 000055c81972b700 RDI: 0000000000000003 RBP: 00007ffdb3dea130 R08: 000055c819728510 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 R13: 000055c81972b6c0 R14: 000055c81972969c R15: 0000000000000080 Fix this moving the check on 'nkeys' earlier in tcf_pedit_init(), so that attempts to install rules having 0 keys are always rejected with -EINVAL. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Davide Caratti Signed-off-by: David S. Miller [bwh: Backported to 3.16: - Drop change in tcf_pedit_keys_ex_parse() - netlink doesn't support error messages - Adjust context] Signed-off-by: Ben Hutchings commit dce6338bbbe61aaaffab7730b623effd2cd65399 Author: Laurent Vivier Date: Thu Nov 14 13:25:48 2019 +0100 virtio_console: allocate inbufs in add_port() only if it is needed commit d791cfcbf98191122af70b053a21075cb450d119 upstream. When we hot unplug a virtserialport and then try to hot plug again, it fails: (qemu) chardev-add socket,id=serial0,path=/tmp/serial0,server,nowait (qemu) device_add virtserialport,bus=virtio-serial0.0,nr=2,\ chardev=serial0,id=serial0,name=serial0 (qemu) device_del serial0 (qemu) device_add virtserialport,bus=virtio-serial0.0,nr=2,\ chardev=serial0,id=serial0,name=serial0 kernel error: virtio-ports vport2p2: Error allocating inbufs qemu error: virtio-serial-bus: Guest failure in adding port 2 for device \ virtio-serial0.0 This happens because buffers for the in_vq are allocated when the port is added but are not released when the port is unplugged. They are only released when virtconsole is removed (see a7a69ec0d8e4) To avoid the problem and to be symmetric, we could allocate all the buffers in init_vqs() as they are released in remove_vqs(), but it sounds like a waste of memory. Rather than that, this patch changes add_port() logic to ignore ENOSPC error in fill_queue(), which means queue has already been filled. Fixes: a7a69ec0d8e4 ("virtio_console: free buffers after reset") Cc: mst@redhat.com Signed-off-by: Laurent Vivier Signed-off-by: Michael S. Tsirkin Signed-off-by: Ben Hutchings commit bd0469d848a5722d7676d1c88c8a3fd57cf0b44a Author: Roman Gushchin Date: Fri Nov 15 17:34:46 2019 -0800 mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup() commit 0362f326d86c645b5e96b7dbc3ee515986ed019d upstream. An exiting task might belong to an offline cgroup. In this case an attempt to grab a cgroup reference from the task can end up with an infinite loop in hugetlb_cgroup_charge_cgroup(), because neither the cgroup will become online, neither the task will be migrated to a live cgroup. Fix this by switching over to css_tryget(). As css_tryget_online() can't guarantee that the cgroup won't go offline, in most cases the check doesn't make sense. In this particular case users of hugetlb_cgroup_charge_cgroup() are not affected by this change. A similar problem is described by commit 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()"). Link: http://lkml.kernel.org/r/20191106225131.3543616-2-guro@fb.com Signed-off-by: Roman Gushchin Acked-by: Johannes Weiner Acked-by: Tejun Heo Reviewed-by: Shakeel Butt Cc: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 920c7230b861076a1478571445f006c180dc91bf Author: Roman Gushchin Date: Fri Nov 15 17:34:43 2019 -0800 mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm() commit 00d484f354d85845991b40141d40ba9e5eb60faf upstream. We've encountered a rcu stall in get_mem_cgroup_from_mm(): rcu: INFO: rcu_sched self-detected stall on CPU rcu: 33-....: (21000 ticks this GP) idle=6c6/1/0x4000000000000002 softirq=35441/35441 fqs=5017 (t=21031 jiffies g=324821 q=95837) NMI backtrace for cpu 33 <...> RIP: 0010:get_mem_cgroup_from_mm+0x2f/0x90 <...> __memcg_kmem_charge+0x55/0x140 __alloc_pages_nodemask+0x267/0x320 pipe_write+0x1ad/0x400 new_sync_write+0x127/0x1c0 __kernel_write+0x4f/0xf0 dump_emit+0x91/0xc0 writenote+0xa0/0xc0 elf_core_dump+0x11af/0x1430 do_coredump+0xc65/0xee0 get_signal+0x132/0x7c0 do_signal+0x36/0x640 exit_to_usermode_loop+0x61/0xd0 do_syscall_64+0xd4/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The problem is caused by an exiting task which is associated with an offline memcg. We're iterating over and over in the do {} while (!css_tryget_online()) loop, but obviously the memcg won't become online and the exiting task won't be migrated to a live memcg. Let's fix it by switching from css_tryget_online() to css_tryget(). As css_tryget_online() cannot guarantee that the memcg won't go offline, the check is usually useless, except some rare cases when for example it determines if something should be presented to a user. A similar problem is described by commit 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()"). Johannes: : The bug aside, it doesn't matter whether the cgroup is online for the : callers. It used to matter when offlining needed to evacuate all charges : from the memcg, and so needed to prevent new ones from showing up, but we : don't care now. Link: http://lkml.kernel.org/r/20191106225131.3543616-1-guro@fb.com Signed-off-by: Roman Gushchin Acked-by: Johannes Weiner Acked-by: Tejun Heo Reviewed-by: Shakeel Butt Cc: Michal Hocko Cc: Michal Koutn Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 17f104db030b5cc871c3a2977a8023ed7cee62b1 Author: Henry Lin Date: Wed Nov 13 10:14:19 2019 +0800 ALSA: usb-audio: not submit urb for stopped endpoint commit 528699317dd6dc722dccc11b68800cf945109390 upstream. While output urb's snd_complete_urb() is executing, calling prepare_outbound_urb() may cause endpoint stopped before prepare_outbound_urb() returns and result in next urb submitted to stopped endpoint. usb-audio driver cannot re-use it afterwards as the urb is still hold by usb stack. This change checks EP_FLAG_RUNNING flag after prepare_outbound_urb() again to let snd_complete_urb() know the endpoint already stopped and does not submit next urb. Below kind of error will be fixed: [ 213.153103] usb 1-2: timeout: still 1 active urbs on EP #1 [ 213.164121] usb 1-2: cannot submit urb 0, error -16: unknown error Signed-off-by: Henry Lin Link: https://lore.kernel.org/r/20191113021420.13377-1-henryl@nvidia.com Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings commit ded0d65fd775966a5115c0468dd02eda4cb855a1 Author: Kai-Heng Feng Date: Wed Oct 16 18:38:16 2019 +0800 x86/quirks: Disable HPET on Intel Coffe Lake platforms commit fc5db58539b49351e76f19817ed1102bf7c712d0 upstream. Some Coffee Lake platforms have a skewed HPET timer once the SoCs entered PC10, which in consequence marks TSC as unstable because HPET is used as watchdog clocksource for TSC. Harry Pan tried to work around it in the clocksource watchdog code [1] thereby creating a circular dependency between HPET and TSC. This also ignores the fact, that HPET is not only unsuitable as watchdog clocksource on these systems, it becomes unusable in general. Disable HPET on affected platforms. Suggested-by: Feng Tang Signed-off-by: Kai-Heng Feng Signed-off-by: Thomas Gleixner Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203183 Link: https://lore.kernel.org/lkml/20190516090651.1396-1-harry.pan@intel.com/ [1] Link: https://lkml.kernel.org/r/20191016103816.30650-1-kai.heng.feng@canonical.com Signed-off-by: Ben Hutchings commit 0943b0426b87c025b0168c22ab4e9436bc742e55 Author: Al Viro Date: Sun Nov 3 13:55:43 2019 -0500 ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either commit 762c69685ff7ad5ad7fee0656671e20a0c9c864d upstream. We need to get the underlying dentry of parent; sure, absent the races it is the parent of underlying dentry, but there's nothing to prevent losing a timeslice to preemtion in the middle of evaluation of lower_dentry->d_parent->d_inode, having another process move lower_dentry around and have its (ex)parent not pinned anymore and freed on memory pressure. Then we regain CPU and try to fetch ->d_inode from memory that is freed by that point. dentry->d_parent *is* stable here - it's an argument of ->lookup() and we are guaranteed that it won't be moved anywhere until we feed it to d_add/d_splice_alias. So we safely go that way to get to its underlying dentry. Signed-off-by: Al Viro [bwh: Backported to 3.16: - Open-code d_inode() - Adjust context] Signed-off-by: Ben Hutchings commit 2c25dad35c4b60ae8882ed75aed93e668b755a96 Author: Al Viro Date: Sun Nov 3 13:45:04 2019 -0500 ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable commit e72b9dd6a5f17d0fb51f16f8685f3004361e83d0 upstream. lower_dentry can't go from positive to negative (we have it pinned), but it *can* go from negative to positive. So fetching ->d_inode into a local variable, doing a blocking allocation, checking that now ->d_inode is non-NULL and feeding the value we'd fetched earlier to a function that won't accept NULL is not a good idea. Signed-off-by: Al Viro [bwh: Backported to 3.16: - Use ACCESS_ONCE() instead of READ_ONCE() - Adjust context] Signed-off-by: Ben Hutchings commit 3e0d5bbb04cebc616ca79fd96dc4ed74064d3e87 Author: Takashi Iwai Date: Sat Nov 9 19:16:58 2019 +0100 ALSA: usb-audio: Fix missing error check at mixer resolution test commit 167beb1756791e0806365a3f86a0da10d7a327ee upstream. A check of the return value from get_cur_mix_raw() is missing at the resolution test code in get_min_max_with_quirks(), which may leave the variable untouched, leading to a random uninitialized value, as detected by syzkaller fuzzer. Add the missing return error check for fixing that. Reported-and-tested-by: syzbot+abe1ab7afc62c6bb6377@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20191109181658.30368-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings commit fbb4995c67ac32830d7c9519150a8623ecf18fdc Author: Dan Carpenter Date: Thu Nov 7 10:48:47 2019 +0300 block: drbd: remove a stray unlock in __drbd_send_protocol() commit 8e9c523016cf9983b295e4bc659183d1fa6ef8e0 upstream. There are two callers of this function and they both unlock the mutex so this ends up being a double unlock. Fixes: 44ed167da748 ("drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf") Signed-off-by: Dan Carpenter Signed-off-by: Jens Axboe Signed-off-by: Ben Hutchings commit f8248efc24768d019f6f782cb16019df68f28876 Author: Alex Deucher Date: Wed Oct 30 10:21:28 2019 -0400 drm/radeon: fix si_enable_smc_cac() failed issue commit 2c409ba81be25516afe05ae27a4a15da01740b01 upstream. Need to set the dte flag on this asic. Port the fix from amdgpu: 5cb818b861be114 ("drm/amd/amdgpu: fix si_enable_smc_cac() failed issue") Reviewed-by: Yong Zhao Signed-off-by: Alex Deucher Signed-off-by: Ben Hutchings commit 44f318899a7727ad8c2edf3c2ccc61b1f9b78acc Author: Kevin Hao Date: Tue Nov 5 21:16:57 2019 -0800 dump_stack: avoid the livelock of the dump_lock commit 5cbf2fff3bba8d3c6a4d47c1754de1cf57e2b01f upstream. In the current code, we use the atomic_cmpxchg() to serialize the output of the dump_stack(), but this implementation suffers the thundering herd problem. We have observed such kind of livelock on a Marvell cn96xx board(24 cpus) when heavily using the dump_stack() in a kprobe handler. Actually we can let the competitors to wait for the releasing of the lock before jumping to atomic_cmpxchg(). This will definitely mitigate the thundering herd problem. Thanks Linus for the suggestion. [akpm@linux-foundation.org: fix comment] Link: http://lkml.kernel.org/r/20191030031637.6025-1-haokexin@gmail.com Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()") Signed-off-by: Kevin Hao Suggested-by: Linus Torvalds Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit ad9b716daea191f7b1773c852df4964bdd86c683 Author: Michal Hocko Date: Tue Nov 5 21:16:40 2019 -0800 mm, vmstat: hide /proc/pagetypeinfo from normal users commit abaed0112c1db08be15a784a2c5c8a8b3063cdd3 upstream. /proc/pagetypeinfo is a debugging tool to examine internal page allocator state wrt to fragmentation. It is not very useful for any other use so normal users really do not need to read this file. Waiman Long has noticed that reading this file can have negative side effects because zone->lock is necessary for gathering data and that a) interferes with the page allocator and its users and b) can lead to hard lockups on large machines which have very long free_list. Reduce both issues by simply not exporting the file to regular users. Link: http://lkml.kernel.org/r/20191025072610.18526-2-mhocko@kernel.org Fixes: 467c996c1e19 ("Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo") Signed-off-by: Michal Hocko Reported-by: Waiman Long Acked-by: Mel Gorman Acked-by: Vlastimil Babka Acked-by: Waiman Long Acked-by: Rafael Aquini Acked-by: David Rientjes Reviewed-by: Andrew Morton Cc: David Hildenbrand Cc: Johannes Weiner Cc: Roman Gushchin Cc: Konstantin Khlebnikov Cc: Jann Horn Cc: Song Liu Cc: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 0f9ec150947d38fac890d9735bd611f13be178e2 Author: Jiri Olsa Date: Tue Nov 5 00:27:11 2019 +0100 perf tools: Fix time sorting commit 722ddfde366fd46205456a9c5ff9b3359dc9a75e upstream. The final sort might get confused when the comparison is done over bigger numbers than int like for -s time. Check the following report for longer workloads: $ perf report -s time -F time,overhead --stdio Fix hist_entry__sort() to properly return int64_t and not possible cut int. Fixes: 043ca389a318 ("perf tools: Use hpp formats to sort final output") Signed-off-by: Jiri Olsa Reviewed-by: Andi Kleen Cc: Alexander Shishkin Cc: Michael Petlan Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lore.kernel.org/lkml/20191104232711.16055-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Ben Hutchings commit 20d30a0297179eac38a3490b9f07901060af006c Author: Kurt Van Dijck Date: Tue Oct 1 09:40:36 2019 +0200 can: c_can: c_can_poll(): only read status register after status IRQ commit 3cb3eaac52c0f145d895f4b6c22834d5f02b8569 upstream. When the status register is read without the status IRQ pending, the chip may not raise the interrupt line for an upcoming status interrupt and the driver may miss a status interrupt. It is critical that the BUSOFF status interrupt is forwarded to the higher layers, since no more interrupts will follow without intervention. Thanks to Wolfgang and Joe for bringing up the first idea. Signed-off-by: Kurt Van Dijck Cc: Wolfgang Grandegger Cc: Joe Burmeister Fixes: fa39b54ccf28 ("can: c_can: Get rid of pointless interrupts") Signed-off-by: Marc Kleine-Budde Signed-off-by: Ben Hutchings commit fa2012d3cc938812d53bc0b0f84c4afbcacfa9b9 Author: Stephane Grosjean Date: Tue Oct 8 10:35:44 2019 +0200 can: peak_usb: fix a potential out-of-sync while decoding packets commit de280f403f2996679e2607384980703710576fed upstream. When decoding a buffer received from PCAN-USB, the first timestamp read in a packet is a 16-bit coded time base, and the next ones are an 8-bit offset to this base, regardless of the type of packet read. This patch corrects a potential loss of synchronization by using a timestamp index read from the buffer, rather than an index of received data packets, to determine on the sizeof the timestamp to be read from the packet being decoded. Signed-off-by: Stephane Grosjean Fixes: 46be265d3388 ("can: usb: PEAK-System Technik PCAN-USB specific part") Signed-off-by: Marc Kleine-Budde Signed-off-by: Ben Hutchings commit 4e5001f7bcfe4321d9fc0a58d2678edaffbadbb7 Author: Johan Hovold Date: Tue Oct 1 12:29:14 2019 +0200 can: usb_8dev: fix use-after-free on disconnect commit 3759739426186a924675651b388d1c3963c5710e upstream. The driver was accessing its driver data after having freed it. Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices") Cc: Bernd Krumboeck Cc: Wolfgang Grandegger Signed-off-by: Johan Hovold Signed-off-by: Marc Kleine-Budde Signed-off-by: Ben Hutchings commit 181610284de81f64a24592926ea286f2e0866dce Author: Lukas Wunner Date: Thu Oct 31 11:06:24 2019 +0100 netfilter: nf_tables: Align nft_expr private data to 64-bit commit 250367c59e6ba0d79d702a059712d66edacd4a1a upstream. Invoking the following commands on a 32-bit architecture with strict alignment requirements (such as an ARMv7-based Raspberry Pi) results in an alignment exception: # nft add table ip test-ip4 # nft add chain ip test-ip4 output { type filter hook output priority 0; } # nft add rule ip test-ip4 output quota 1025 bytes Alignment trap: not handling instruction e1b26f9f at [<7f4473f8>] Unhandled fault: alignment exception (0x001) at 0xb832e824 Internal error: : 1 [#1] PREEMPT SMP ARM Hardware name: BCM2835 [<7f4473fc>] (nft_quota_do_init [nft_quota]) [<7f447448>] (nft_quota_init [nft_quota]) [<7f4260d0>] (nf_tables_newrule [nf_tables]) [<7f4168dc>] (nfnetlink_rcv_batch [nfnetlink]) [<7f416bd0>] (nfnetlink_rcv [nfnetlink]) [<8078b334>] (netlink_unicast) [<8078b664>] (netlink_sendmsg) [<8071b47c>] (sock_sendmsg) [<8071bd18>] (___sys_sendmsg) [<8071ce3c>] (__sys_sendmsg) [<8071ce94>] (sys_sendmsg) The reason is that nft_quota_do_init() calls atomic64_set() on an atomic64_t which is only aligned to 32-bit, not 64-bit, because it succeeds struct nft_expr in memory which only contains a 32-bit pointer. Fix by aligning the nft_expr private data to 64-bit. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Lukas Wunner Signed-off-by: Pablo Neira Ayuso Signed-off-by: Ben Hutchings commit 2797aec2a5555fb8ea46d3358f8632a87e6c1251 Author: Dan Carpenter Date: Sat Aug 24 17:49:55 2019 +0300 netfilter: ipset: Fix an error code in ip_set_sockfn_get() commit 30b7244d79651460ff114ba8f7987ed94c86b99a upstream. The copy_to_user() function returns the number of bytes remaining to be copied. In this code, that positive return is checked at the end of the function and we return zero/success. What we should do instead is return -EFAULT. Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Dan Carpenter Signed-off-by: Jozsef Kadlecsik Signed-off-by: Ben Hutchings commit 0f1cd1a7518eb06efd2c0473eeac70680971a725 Author: Eric Dumazet Date: Mon Nov 4 07:57:55 2019 -0800 dccp: do not leak jiffies on the wire commit 3d1e5039f5f87a8731202ceca08764ee7cb010d3 upstream. For some reason I missed the case of DCCP passive flows in my previous patch. Fixes: a904a0693c18 ("inet: stop leaking jiffies on the wire") Signed-off-by: Eric Dumazet Reported-by: Thiemo Nagel Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 5c5272d8e1cc6a8280e6d488e3cae1b0e845c5e5 Author: Takashi Sakamoto Date: Sun Nov 3 00:09:20 2019 +0900 ALSA: bebob: fix to detect configured source of sampling clock for Focusrite Saffire Pro i/o series commit 706ad6746a66546daf96d4e4a95e46faf6cf689a upstream. For Focusrite Saffire Pro i/o, the lowest 8 bits of register represents configured source of sampling clock. The next lowest 8 bits represents whether the configured source is actually detected or not just after the register is changed for the source. Current implementation evaluates whole the register to detect configured source. This results in failure due to the next lowest 8 bits when the source is connected in advance. This commit fixes the bug. Fixes: 25784ec2d034 ("ALSA: bebob: Add support for Focusrite Saffire/SaffirePro series") Signed-off-by: Takashi Sakamoto Link: https://lore.kernel.org/r/20191102150920.20367-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings commit c1fb8e44ba97f8589b7fc5137e4b45561a27dcc5 Author: Eric Dumazet Date: Fri Nov 1 10:32:19 2019 -0700 inet: stop leaking jiffies on the wire commit a904a0693c189691eeee64f6c6b188bd7dc244e9 upstream. Historically linux tried to stick to RFC 791, 1122, 2003 for IPv4 ID field generation. RFC 6864 made clear that no matter how hard we try, we can not ensure unicity of IP ID within maximum lifetime for all datagrams with a given source address/destination address/protocol tuple. Linux uses a per socket inet generator (inet_id), initialized at connection startup with a XOR of 'jiffies' and other fields that appear clear on the wire. Thiemo Nagel pointed that this strategy is a privacy concern as this provides 16 bits of entropy to fingerprint devices. Let's switch to a random starting point, this is just as good as far as RFC 6864 is concerned and does not leak anything critical. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Reported-by: Thiemo Nagel Signed-off-by: David S. Miller [bwh: Backported to 3.16: drop changes in chelsio] Signed-off-by: Ben Hutchings commit c97d5d87f654979c81a36fbd75844a41e5a82cdc Author: Yihui ZENG Date: Fri Oct 25 12:31:48 2019 +0300 s390/cmm: fix information leak in cmm_timeout_handler() commit b8e51a6a9db94bc1fb18ae831b3dab106b5a4b5f upstream. The problem is that we were putting the NUL terminator too far: buf[sizeof(buf) - 1] = '\0'; If the user input isn't NUL terminated and they haven't initialized the whole buffer then it leads to an info leak. The NUL terminator should be: buf[len - 1] = '\0'; Signed-off-by: Yihui Zeng Signed-off-by: Dan Carpenter [heiko.carstens@de.ibm.com: keep semantics of how *lenp and *ppos are handled] Signed-off-by: Heiko Carstens Signed-off-by: Vasily Gorbik Signed-off-by: Ben Hutchings commit 55bdc8a20226bb813ffa39acdd8089bd76b60beb Author: Takashi Iwai Date: Wed Oct 30 22:42:57 2019 +0100 ALSA: timer: Fix mutex deadlock at releasing card commit a39331867335d4a94b6165e306265c9e24aca073 upstream. When a card is disconnected while in use, the system waits until all opened files are closed then releases the card. This is done via put_device() of the card device in each device release code. The recently reported mutex deadlock bug happens in this code path; snd_timer_close() for the timer device deals with the global register_mutex and it calls put_device() there. When this timer device is the last one, the card gets freed and it eventually calls snd_timer_free(), which has again the protection with the global register_mutex -- boom. Basically put_device() call itself is race-free, so a relative simple workaround is to move this put_device() call out of the mutex. For achieving that, in this patch, snd_timer_close_locked() got a new argument to store the card device pointer in return, and each caller invokes put_device() with the returned object after the mutex unlock. Reported-and-tested-by: Kirill A. Shutemov Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings commit 7e625ef1b008551a72e6d579ca272d6cd162e672 Author: Takashi Iwai Date: Wed Nov 6 17:55:47 2019 +0100 ALSA: timer: Fix incorrectly assigned timer instance commit e7af6307a8a54f0b873960b32b6a644f2d0fbd97 upstream. The clean up commit 41672c0c24a6 ("ALSA: timer: Simplify error path in snd_timer_open()") unified the error handling code paths with the standard goto, but it introduced a subtle bug: the timer instance is stored in snd_timer_open() incorrectly even if it returns an error. This may eventually lead to UAF, as spotted by fuzzer. The culprit is the snd_timer_open() code checks the SNDRV_TIMER_IFLG_EXCLUSIVE flag with the common variable timeri. This variable is supposed to be the newly created instance, but we (ab-)used it for a temporary check before the actual creation of a timer instance. After that point, there is another check for the max number of instances, and it bails out if over the threshold. Before the refactoring above, it worked fine because the code returned directly from that point. After the refactoring, however, it jumps to the unified error path that stores the timeri variable in return -- even if it returns an error. Unfortunately this stored value is kept in the caller side (snd_timer_user_tselect()) in tu->timeri. This causes inconsistency later, as if the timer was successfully assigned. In this patch, we fix it by not re-using timeri variable but a temporary variable for testing the exclusive connection, so timeri remains NULL at that point. Fixes: 41672c0c24a6 ("ALSA: timer: Simplify error path in snd_timer_open()") Reported-and-tested-by: Tristan Madani Link: https://lore.kernel.org/r/20191106165547.23518-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings commit 3e16d31a94e3224ec2d9079bacc93d7eeb6f3164 Author: Takashi Iwai Date: Thu Mar 28 17:11:10 2019 +0100 ALSA: timer: Simplify error path in snd_timer_open() commit 41672c0c24a62699d20aab53b98d843b16483053 upstream. Just a minor refactoring to use the standard goto for error paths in snd_timer_open() instead of open code. The first mutex_lock() is moved to the beginning of the function to make the code clearer. Signed-off-by: Takashi Iwai [bwh: Backported to 3.16 as dependency of commit a39331867335 "ALSA: timer: Fix mutex deadlock at releasing card"] Signed-off-by: Ben Hutchings commit 6c69ce12bc4caf2a3175e86a0393438e7d5380ea Author: Markus Theil Date: Tue Oct 29 10:30:03 2019 +0100 nl80211: fix validation of mesh path nexthop commit 1fab1b89e2e8f01204a9c05a39fd0b6411a48593 upstream. Mesh path nexthop should be a ethernet address, but current validation checks against 4 byte integers. Fixes: 2ec600d672e74 ("nl80211/cfg80211: support for mesh, sta dumping") Signed-off-by: Markus Theil Link: https://lore.kernel.org/r/20191029093003.10355-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 7f79d40d6e883d984c7dbbee58e2e5f7d1a3125b Author: Johan Hovold Date: Tue Oct 29 11:23:54 2019 +0100 USB: serial: whiteheat: fix line-speed endianness commit 84968291d7924261c6a0624b9a72f952398e258b upstream. Add missing endianness conversion when setting the line speed so that this driver might work also on big-endian machines. Also use an unsigned format specifier in the corresponding debug message. Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191029102354.2733-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit c5d931201ff96acb47c5d30f6cdf1e0d8c8530fe Author: Johan Hovold Date: Tue Oct 29 11:23:53 2019 +0100 USB: serial: whiteheat: fix potential slab corruption commit 1251dab9e0a2c4d0d2d48370ba5baa095a5e8774 upstream. Fix a user-controlled slab buffer overflow due to a missing sanity check on the bulk-out transfer buffer used for control requests. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191029102354.2733-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 24b3f59102b33fcfa8963f7daa78d9879c08c9d2 Author: Al Viro Date: Tue Oct 29 13:53:29 2019 +0000 ceph: add missing check in d_revalidate snapdir handling commit 1f08529c84cfecaf1261ed9b7e17fab18541c58f upstream. We should not play with dcache without parent locked... Signed-off-by: Al Viro Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov [bwh: Backported to 3.16: - Test ceph_mds_request::r_locked_dir - Adjust context] Signed-off-by: Ben Hutchings commit 0702f35c5d38f66ac7fb0d247d492818f289951b Author: Luis Henriques Date: Fri Oct 25 14:05:24 2019 +0100 ceph: fix use-after-free in __ceph_remove_cap() commit ea60ed6fcf29eebc78f2ce91491e6309ee005a01 upstream. KASAN reports a use-after-free when running xfstest generic/531, with the following trace: [ 293.903362] kasan_report+0xe/0x20 [ 293.903365] rb_erase+0x1f/0x790 [ 293.903370] __ceph_remove_cap+0x201/0x370 [ 293.903375] __ceph_remove_caps+0x4b/0x70 [ 293.903380] ceph_evict_inode+0x4e/0x360 [ 293.903386] evict+0x169/0x290 [ 293.903390] __dentry_kill+0x16f/0x250 [ 293.903394] dput+0x1c6/0x440 [ 293.903398] __fput+0x184/0x330 [ 293.903404] task_work_run+0xb9/0xe0 [ 293.903410] exit_to_usermode_loop+0xd3/0xe0 [ 293.903413] do_syscall_64+0x1a0/0x1c0 [ 293.903417] entry_SYSCALL_64_after_hwframe+0x44/0xa9 This happens because __ceph_remove_cap() may queue a cap release (__ceph_queue_cap_release) which can be scheduled before that cap is removed from the inode list with rb_erase(&cap->ci_node, &ci->i_caps); And, when this finally happens, the use-after-free will occur. This can be fixed by removing the cap from the inode list before being removed from the session list, and thus eliminating the risk of an UAF. Signed-off-by: Luis Henriques Reviewed-by: Jeff Layton Signed-off-by: Ilya Dryomov Signed-off-by: Ben Hutchings commit d7e3f2fe01372eb914d0e451f0e7a46cbcb98f9e Author: Alan Stern Date: Mon Oct 28 10:54:26 2019 -0400 USB: gadget: Reject endpoints with 0 maxpacket value commit 54f83b8c8ea9b22082a496deadf90447a326954e upstream. Endpoints with a maxpacket length of 0 are probably useless. They can't transfer any data, and it's not at all unlikely that a UDC will crash or hang when trying to handle a non-zero-length usb_request for such an endpoint. Indeed, dummy-hcd gets a divide error when trying to calculate the remainder of a transfer length by the maxpacket value, as discovered by the syzbot fuzzer. Currently the gadget core does not check for endpoints having a maxpacket value of 0. This patch adds a check to usb_ep_enable(), preventing such endpoints from being used. As far as I know, none of the gadget drivers in the kernel tries to create an endpoint with maxpacket = 0, but until now there has been nothing to prevent userspace programs under gadgetfs or configfs from doing it. Signed-off-by: Alan Stern Reported-and-tested-by: syzbot+8ab8bf161038a8768553@syzkaller.appspotmail.com Acked-by: Felipe Balbi Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281052370.1485-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 690dfe5668bc3173d67b5a1b984baf368182e100 Author: Nicholas Piggin Date: Thu Oct 24 16:38:04 2019 +1000 scsi: qla2xxx: stop timer in shutdown path commit d3566abb1a1e7772116e4d50fb6a58d19c9802e5 upstream. In shutdown/reboot paths, the timer is not stopped: qla2x00_shutdown pci_device_shutdown device_shutdown kernel_restart_prepare kernel_restart sys_reboot This causes lockups (on powerpc) when firmware config space access calls are interrupted by smp_send_stop later in reboot. Fixes: e30d1756480dc ("[SCSI] qla2xxx: Addition of shutdown callback handler.") Link: https://lore.kernel.org/r/20191024063804.14538-1-npiggin@gmail.com Signed-off-by: Nicholas Piggin Acked-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Ben Hutchings commit 5cfbeddb025000363f109dd652aeb95437ef41ed Author: Tejun Heo Date: Thu Oct 24 13:50:27 2019 -0700 net: fix sk_page_frag() recursion from memory reclaim commit 20eb4f29b60286e0d6dc01d9c260b4bd383c58fb upstream. sk_page_frag() optimizes skb_frag allocations by using per-task skb_frag cache when it knows it's the only user. The condition is determined by seeing whether the socket allocation mask allows blocking - if the allocation may block, it obviously owns the task's context and ergo exclusively owns current->task_frag. Unfortunately, this misses recursion through memory reclaim path. Please take a look at the following backtrace. [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10 ... tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 sock_xmit.isra.24+0xa1/0x170 [nbd] nbd_send_cmd+0x1d2/0x690 [nbd] nbd_queue_rq+0x1b5/0x3b0 [nbd] __blk_mq_try_issue_directly+0x108/0x1b0 blk_mq_request_issue_directly+0xbd/0xe0 blk_mq_try_issue_list_directly+0x41/0xb0 blk_mq_sched_insert_requests+0xa2/0xe0 blk_mq_flush_plug_list+0x205/0x2a0 blk_flush_plug_list+0xc3/0xf0 [1] blk_finish_plug+0x21/0x2e _xfs_buf_ioapply+0x313/0x460 __xfs_buf_submit+0x67/0x220 xfs_buf_read_map+0x113/0x1a0 xfs_trans_read_buf_map+0xbf/0x330 xfs_btree_read_buf_block.constprop.42+0x95/0xd0 xfs_btree_lookup_get_block+0x95/0x170 xfs_btree_lookup+0xcc/0x470 xfs_bmap_del_extent_real+0x254/0x9a0 __xfs_bunmapi+0x45c/0xab0 xfs_bunmapi+0x15/0x30 xfs_itruncate_extents_flags+0xca/0x250 xfs_free_eofblocks+0x181/0x1e0 xfs_fs_destroy_inode+0xa8/0x1b0 destroy_inode+0x38/0x70 dispose_list+0x35/0x50 prune_icache_sb+0x52/0x70 super_cache_scan+0x120/0x1a0 do_shrink_slab+0x120/0x290 shrink_slab+0x216/0x2b0 shrink_node+0x1b6/0x4a0 do_try_to_free_pages+0xc6/0x370 try_to_free_mem_cgroup_pages+0xe3/0x1e0 try_charge+0x29e/0x790 mem_cgroup_charge_skmem+0x6a/0x100 __sk_mem_raise_allocated+0x18e/0x390 __sk_mem_schedule+0x2a/0x40 [0] tcp_sendmsg_locked+0x8eb/0xe10 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x26d/0x2b0 __sys_sendmsg+0x57/0xa0 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 In [0], tcp_send_msg_locked() was using current->page_frag when it called sk_wmem_schedule(). It already calculated how many bytes can be fit into current->page_frag. Due to memory pressure, sk_wmem_schedule() called into memory reclaim path which called into xfs and then IO issue path. Because the filesystem in question is backed by nbd, the control goes back into the tcp layer - back into tcp_sendmsg_locked(). nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes sense - it's in the process of freeing memory and wants to be able to, e.g., drop clean pages to make forward progress. However, this confused sk_page_frag() called from [2]. Because it only tests whether the allocation allows blocking which it does, it now thinks current->page_frag can be used again although it already was being used in [0]. After [2] used current->page_frag, the offset would be increased by the used amount. When the control returns to [0], current->page_frag's offset is increased and the previously calculated number of bytes now may overrun the end of allocated memory leading to silent memory corruptions. Fix it by adding gfpflags_normal_context() which tests sleepable && !reclaim and use it to determine whether to use current->task_frag. v2: Eric didn't like gfp flags being tested twice. Introduce a new helper gfpflags_normal_context() and combine the two tests. Signed-off-by: Tejun Heo Cc: Josef Bacik Cc: Eric Dumazet Signed-off-by: David S. Miller [bwh: Backported to 3.16: Keep testing __GFP_WAIT flag instead of __GFP_DIRECT_RECLAIM.] Signed-off-by: Ben Hutchings commit ee1aa6bdb1ad19683a379b1119c8791c96dcee46 Author: Johan Hovold Date: Tue Oct 22 17:31:27 2019 +0200 USB: ldusb: fix control-message timeout commit 52403cfbc635d28195167618690595013776ebde upstream. USB control-message timeouts are specified in milliseconds, not jiffies. Waiting 83 minutes for a transfer to complete is a bit excessive. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Reported-by: syzbot+a4fbb3bb76cda0ea4e58@syzkaller.appspotmail.com Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191022153127.22295-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 253b043754c296b70a6f44e8cc3298d5dc95ead8 Author: Johan Hovold Date: Tue Oct 22 16:32:02 2019 +0200 USB: ldusb: fix ring-buffer locking commit d98ee2a19c3334e9343df3ce254b496f1fc428eb upstream. The custom ring-buffer implementation was merged without any locking or explicit memory barriers, but a spinlock was later added by commit 9d33efd9a791 ("USB: ldusb bugfix"). The lock did not cover the update of the tail index once the entry had been processed, something which could lead to memory corruption on weakly ordered architectures or due to compiler optimisations. Specifically, a completion handler running on another CPU might observe the incremented tail index and update the entry before ld_usb_read() is done with it. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Fixes: 9d33efd9a791 ("USB: ldusb bugfix") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191022143203.5260-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit d1312e710afc0cf5a5b3a425900e0fbf92739d24 Author: Alexandre Belloni Date: Fri Sep 20 17:39:06 2019 +0200 clk: at91: avoid sleeping early commit 658fd65cf0b0d511de1718e48d9a28844c385ae0 upstream. It is not allowed to sleep to early in the boot process and this may lead to kernel issues if the bootloader didn't prepare the slow clock and main clock. This results in the following error and dump stack on the AriettaG25: bad: scheduling from the idle thread! Ensure it is possible to sleep, else simply have a delay. Reported-by: Uwe Kleine-König Signed-off-by: Alexandre Belloni Link: https://lkml.kernel.org/r/20190920153906.20887-1-alexandre.belloni@bootlin.com Fixes: 80eded6ce8bb ("clk: at91: add slow clks driver") Tested-by: Uwe Kleine-König Signed-off-by: Stephen Boyd [bwh: Backported to 3.16: - Drop changes in clk_sama5d4_slow_osc_prepare() - Adjust filename, context] Signed-off-by: Ben Hutchings commit a2a8878701dcdfd52ffc528911a3828c8d60614a Author: Kim Phillips Date: Wed Oct 23 10:09:55 2019 -0500 perf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h) commit e431e79b60603079d269e0c2a5177943b95fa4b6 upstream. This saves us writing the IBS control MSR twice when disabling the event. I searched revision guides for all families since 10h, and did not find occurrence of erratum #420, nor anything remotely similar: so we isolate the secondary MSR write to family 10h only. Also unconditionally update the count mask for IBS Op implementations that have read & writeable current count (CurCnt) fields in addition to the MaxCnt field. These bits were reserved on prior implementations, and therefore shouldn't have negative impact. Signed-off-by: Kim Phillips Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Jiri Olsa Cc: Linus Torvalds Cc: Mark Rutland Cc: Namhyung Kim Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Fixes: c9574fe0bdb9 ("perf/x86-ibs: Implement workaround for IBS erratum #420") Link: https://lkml.kernel.org/r/20191023150955.30292-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar [bwh: Backported to 3.16: - Don't update the count mask; we don't use or define the CurCnt fields here - Adjust filename] Signed-off-by: Ben Hutchings commit 113852441c6f5d644cb122a7302e1a22be99aca8 Author: Kim Phillips Date: Wed Oct 23 10:09:54 2019 -0500 perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise RIP validity commit 317b96bb14303c7998dbcd5bc606bd8038fdd4b4 upstream. The loop that reads all the IBS MSRs into *buf stopped one MSR short of reading the IbsOpData register, which contains the RipInvalid status bit. Fix the offset_max assignment so the MSR gets read, so the RIP invalid evaluation is based on what the IBS h/w output, instead of what was left in memory. Signed-off-by: Kim Phillips Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Jiri Olsa Cc: Linus Torvalds Cc: Mark Rutland Cc: Namhyung Kim Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Fixes: d47e8238cd76 ("perf/x86-ibs: Take instruction pointer from ibs sample") Link: https://lkml.kernel.org/r/20191023150955.30292-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings commit a2815101acb07a5ef2d379e04f6e7b084f04fb99 Author: Cristian Birsan Date: Fri Oct 4 20:10:54 2019 +0300 usb: gadget: udc: atmel: Fix interrupt storm in FIFO mode. commit ba3a1a915c49cc3023e4ddfc88f21e7514e82aa4 upstream. Fix interrupt storm generated by endpoints when working in FIFO mode. The TX_COMPLETE interrupt is used only by control endpoints processing. Do not enable it for other types of endpoints. Fixes: 914a3f3b3754 ("USB: add atmel_usba_udc driver") Signed-off-by: Cristian Birsan Signed-off-by: Felipe Balbi Signed-off-by: Ben Hutchings commit aade906f0673af2c846562999de40e6f6eaef4b4 Author: Takashi Sakamoto Date: Sat Oct 26 12:06:20 2019 +0900 ALSA: bebob: Fix prototype of helper function to return negative value commit f2bbdbcb075f3977a53da3bdcb7cd460bc8ae5f2 upstream. A helper function of ALSA bebob driver returns negative value in a function which has a prototype to return unsigned value. This commit fixes it by changing the prototype. Fixes: eb7b3a056cd8 ("ALSA: bebob: Add commands and connections/streams management") Signed-off-by: Takashi Sakamoto Link: https://lore.kernel.org/r/20191026030620.12077-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings commit f0bc6b7adccdb73775cca8af6843b53b7fa98805 Author: Marek Szyprowski Date: Fri Oct 25 11:02:01 2019 +0200 clk: samsung: exynos5420: Preserve PLL configuration during suspend/resume commit e9323b664ce29547d996195e8a6129a351c39108 upstream. Properly save and restore all top PLL related configuration registers during suspend/resume cycle. So far driver only handled EPLL and RPLL clocks, all other were reset to default values after suspend/resume cycle. This caused for example lower G3D (MALI Panfrost) performance after system resume, even if performance governor has been selected. Reported-by: Reported-by: Marian Mihailescu Fixes: 773424326b51 ("clk: samsung: exynos5420: add more registers to restore list") Signed-off-by: Marek Szyprowski Signed-off-by: Sylwester Nawrocki Signed-off-by: Ben Hutchings commit 7f69044c7ebc1f6efbe443305aa6b136404b162d Author: Taehee Yoo Date: Mon Oct 21 18:47:52 2019 +0000 bonding: fix unexpected IFF_BONDING bit unset commit 65de65d9033750d2cf1b336c9d6e9da3a8b5cc6e upstream. The IFF_BONDING means bonding master or bonding slave device. ->ndo_add_slave() sets IFF_BONDING flag and ->ndo_del_slave() unsets IFF_BONDING flag. bond0<--bond1 Both bond0 and bond1 are bonding device and these should keep having IFF_BONDING flag until they are removed. But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine do not check whether the slave device is the bonding type or not. This patch adds the interface type check routine before removing IFF_BONDING flag. Test commands: ip link add bond0 type bond ip link add bond1 type bond ip link set bond1 master bond0 ip link set bond1 nomaster ip link del bond1 type bond ip link add bond1 type bond Splat looks like: [ 226.665555] proc_dir_entry 'bonding/bond1' already registered [ 226.666440] WARNING: CPU: 0 PID: 737 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0 [ 226.667571] Modules linked in: bonding af_packet sch_fq_codel ip_tables x_tables unix [ 226.668662] CPU: 0 PID: 737 Comm: ip Not tainted 5.4.0-rc3+ #96 [ 226.669508] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 226.670652] RIP: 0010:proc_register+0x2a9/0x3e0 [ 226.671612] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 0b 14 9f 48 8b b0 e 0 00 00 00 e8 07 e7 88 ff <0f> 0b 48 c7 c7 40 2d a5 9f e8 59 d6 23 01 48 8b 4c 24 10 48 b8 00 [ 226.675007] RSP: 0018:ffff888050e17078 EFLAGS: 00010282 [ 226.675761] RAX: dffffc0000000008 RBX: ffff88805fdd0f10 RCX: ffffffff9dd344e2 [ 226.676757] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f6b8c [ 226.677751] RBP: ffff8880507160f3 R08: ffffed100d940019 R09: ffffed100d940019 [ 226.678761] R10: 0000000000000001 R11: ffffed100d940018 R12: ffff888050716008 [ 226.679757] R13: ffff8880507160f2 R14: dffffc0000000000 R15: ffffed100a0e2c1e [ 226.680758] FS: 00007fdc217cc0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 226.681886] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 226.682719] CR2: 00007f49313424d0 CR3: 0000000050e46001 CR4: 00000000000606f0 [ 226.683727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 226.684725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 226.685681] Call Trace: [ 226.687089] proc_create_seq_private+0xb3/0xf0 [ 226.687778] bond_create_proc_entry+0x1b3/0x3f0 [bonding] [ 226.691458] bond_netdev_event+0x433/0x970 [bonding] [ 226.692139] ? __module_text_address+0x13/0x140 [ 226.692779] notifier_call_chain+0x90/0x160 [ 226.693401] register_netdevice+0x9b3/0xd80 [ 226.694010] ? alloc_netdev_mqs+0x854/0xc10 [ 226.694629] ? netdev_change_features+0xa0/0xa0 [ 226.695278] ? rtnl_create_link+0x2ed/0xad0 [ 226.695849] bond_newlink+0x2a/0x60 [bonding] [ 226.696422] __rtnl_newlink+0xb9f/0x11b0 [ 226.696968] ? rtnl_link_unregister+0x220/0x220 [ ... ] Fixes: 0b680e753724 ("[PATCH] bonding: Add priv_flag to avoid event mishandling") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 74b5e1077f6dbc4cf0e1bead551d3799dc6fba0c Author: Eric Dumazet Date: Wed Oct 23 09:53:03 2019 -0700 ipvs: move old_secure_tcp into struct netns_ipvs commit c24b75e0f9239e78105f81c5f03a751641eb07ef upstream. syzbot reported the following issue : BUG: KCSAN: data-race in update_defense_level / update_defense_level read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1: update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0: update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events defense_work_handler Indeed, old_secure_tcp is currently a static variable, while it needs to be a per netns variable. Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: Simon Horman Signed-off-by: Ben Hutchings commit 41f4d9e0905fd6c1013bb0d4318239cae5d51520 Author: Paul Burton Date: Fri Oct 18 15:38:48 2019 -0700 MIPS: tlbex: Fix build_restore_pagemask KScratch restore commit b42aa3fd5957e4daf4b69129e5ce752a2a53e7d6 upstream. build_restore_pagemask() will restore the value of register $1/$at when its restore_scratch argument is non-zero, and aims to do so by filling a branch delay slot. Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") added an EHB instruction (Execution Hazard Barrier) prior to restoring $1 from a KScratch register, in order to resolve a hazard that can result in stale values of the KScratch register being observed. In particular, P-class CPUs from MIPS with out of order execution pipelines such as the P5600 & P6600 are affected. Unfortunately this EHB instruction was inserted in the branch delay slot causing the MFC0 instruction which performs the restoration to no longer execute along with the branch. The result is that the $1 register isn't actually restored, ie. the TLB refill exception handler clobbers it - which is exactly the problem the EHB is meant to avoid for the P-class CPUs. Similarly build_get_pgd_vmalloc() will restore the value of $1/$at when its mode argument equals refill_scratch, and suffers from the same problem. Fix this by in both cases moving the EHB earlier in the emitted code. There's no reason it needs to immediately precede the MFC0 - it simply needs to be between the MTC0 & MFC0. This bug only affects Cavium Octeon systems which use build_fast_tlb_refill_handler(). Signed-off-by: Paul Burton Fixes: 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") Cc: Dmitry Korotin Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ben Hutchings commit 35054e6a9382f15e63d7d4567cdfb1eab7850b84 Author: Jonas Gorski Date: Tue Oct 22 21:11:00 2019 +0200 MIPS: bmips: mark exception vectors as char arrays commit e4f5cb1a9b27c0f94ef4f5a0178a3fde2d3d0e9e upstream. The vectors span more than one byte, so mark them as arrays. Fixes the following build error when building when using GCC 8.3: In file included from ./include/linux/string.h:19, from ./include/linux/bitmap.h:9, from ./include/linux/cpumask.h:12, from ./arch/mips/include/asm/processor.h:15, from ./arch/mips/include/asm/thread_info.h:16, from ./include/linux/thread_info.h:38, from ./include/asm-generic/preempt.h:5, from ./arch/mips/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:81, from ./include/linux/spinlock.h:51, from ./include/linux/mmzone.h:8, from ./include/linux/bootmem.h:8, from arch/mips/bcm63xx/prom.c:10: arch/mips/bcm63xx/prom.c: In function 'prom_init': ./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds] __ret = __builtin_memcpy((dst), (src), __len); \ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy' memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20); ^~~~~~ In file included from arch/mips/bcm63xx/prom.c:14: ./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here extern char bmips_smp_movevec; Fixes: 18a1eef92dcd ("MIPS: BMIPS: Introduce bmips.h") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Signed-off-by: Paul Burton Cc: linux-mips@vger.kernel.org Cc: Ralf Baechle Cc: James Hogan Signed-off-by: Ben Hutchings commit 27a2accfafe394d79515d09a8d63b21459a03b75 Author: Russell King Date: Wed Oct 23 14:46:44 2019 +0100 ASoC: kirkwood: fix external clock probe defer commit 4523817d51bc3b2ef38da768d004fda2c8bc41de upstream. When our call to get the external clock fails, we forget to clean up the enabled internal clock correctly. Enable the clock after we have obtained all our resources. Fixes: 84aac6c79bfd ("ASoC: kirkwood: fix loss of external clock at probe time") Signed-off-by: Russell King Link: https://lore.kernel.org/r/E1iNGyK-0004oF-6A@rmk-PC.armlinux.org.uk Signed-off-by: Mark Brown Signed-off-by: Ben Hutchings commit 096d3326951218dc56f120014d6500886b419e94 Author: Miklos Szeredi Date: Wed Oct 23 14:26:37 2019 +0200 fuse: truncate pending writes on O_TRUNC commit e4648309b85a78f8c787457832269a8712a8673e upstream. Make sure cached writes are not reordered around open(..., O_TRUNC), with the obvious wrong results. Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Signed-off-by: Miklos Szeredi [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 9c450e0806911c318fa205cd962cfd775d0e2647 Author: Miklos Szeredi Date: Wed Oct 23 14:26:37 2019 +0200 fuse: flush dirty data/metadata before non-truncate setattr commit b24e7598db62386a95a3c8b9c75630c5d56fe077 upstream. If writeback cache is enabled, then writes might get reordered with chmod/chown/utimes. The problem with this is that performing the write in the fuse daemon might itself change some of these attributes. In such case the following sequence of operations will result in file ending up with the wrong mode, for example: int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL); write (fd, "1", 1); fchown (fd, 0, 0); fchmod (fd, 04755); close (fd); This patch fixes this by flushing pending writes before performing chown/chmod/utimes. Reported-by: Giuseppe Scrivano Tested-by: Giuseppe Scrivano Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Signed-off-by: Miklos Szeredi Signed-off-by: Ben Hutchings commit 3f3e89e96813e4f71f42360ea4e00fcb89ebec29 Author: Daniel Wagner Date: Tue Oct 22 09:21:12 2019 +0200 scsi: lpfc: Honor module parameter lpfc_use_adisc commit 0fd103ccfe6a06e40e2d9d8c91d96332cc9e1239 upstream. The initial lpfc_desc_set_adisc implementation in commit dea3101e0a5c ("lpfc: add Emulex FC driver version 8.0.28") enabled ADISC if cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE In commit 92d7f7b0cde3 ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3") this changed to (cfg_use_adisc && RSC_MODE) || FCP_2_DEVICE and later in commit ffc954936b13 ("[SCSI] lpfc 8.3.13: FC Discovery Fixes and enhancements.") to (cfg_use_adisc && RSC_MODE) || (FCP_2_DEVICE && FCP_TARGET) A customer reports that after a devloss, an ADISC failure is logged. It turns out the ADISC flag is set even the user explicitly set lpfc_use_adisc = 0. [Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x82000000 x8 xa [Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure DID:05DF00 Status:x9/x70000 [mkp: fixed Hannes' email] Fixes: 92d7f7b0cde3 ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3") Cc: Dick Kennedy Cc: James Smart Link: https://lore.kernel.org/r/20191022072112.132268-1-dwagner@suse.de Reviewed-by: Hannes Reinecke Reviewed-by: James Smart Signed-off-by: Daniel Wagner Signed-off-by: Martin K. Petersen Signed-off-by: Ben Hutchings commit ac7b20c207fc1c1c6fb00df729b8e7132821e665 Author: Alexey Brodkin Date: Tue Oct 22 17:04:11 2019 +0300 ARC: perf: Accommodate big-endian CPU commit 5effc09c4907901f0e71e68e5f2e14211d9a203f upstream. 8-letter strings representing ARC perf events are stores in two 32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc. And the same order of bytes in the word is used regardless CPU endianness. Which means in case of big-endian CPU core we need to swap bytes to get the same order as if it was on little-endian CPU. Otherwise we're seeing the following error message on boot: ------------------------->8---------------------- ARC perf : 8 counters (32 bits), 40 conditions, [overflow IRQ support] sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 sysfs_warn_dup+0x46/0x58 sysfs_add_file_mode_ns+0xb2/0x168 create_files+0x70/0x2a0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0 Failed to register pmu: arc_pct, reason -17 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 __warn+0x9c/0xd4 warn_slowpath_fmt+0x22/0x2c perf_event_sysfs_init+0x70/0xa0 ---[ end trace a75fb9a9837bd1ec ]--- ------------------------->8---------------------- What happens here we're trying to register more than one raw perf event with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters and encoded into two 32-bit words. In this particular case we deal with 2 events: * "IJMP____" which counts all jump & branch instructions * "IJMPC___" which counts only conditional jumps & branches Those strings are split in two 32-bit words this way "IJMP" + "____" & "IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core being big-endian then we read "PMJI" + "____" & "PMJI" + "___C". And since we interpret read array of ASCII letters as a null-terminated string on big-endian CPU we end up with 2 events of the same name "PMJI". Signed-off-by: Alexey Brodkin Signed-off-by: Vineet Gupta [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 71e17b06e398f7c14078198f0145ac702c05a351 Author: Roberto Bergantinos Corpas Date: Mon Oct 14 10:59:23 2019 +0200 CIFS: avoid using MID 0xFFFF commit 03d9a9fe3f3aec508e485dd3dcfa1e99933b4bdb upstream. According to MS-CIFS specification MID 0xFFFF should not be used by the CIFS client, but we actually do. Besides, this has proven to cause races leading to oops between SendReceive2/cifs_demultiplex_thread. On SMB1, MID is a 2 byte value easy to reach in CurrentMid which may conflict with an oplock break notification request coming from server Signed-off-by: Roberto Bergantinos Corpas Reviewed-by: Ronnie Sahlberg Reviewed-by: Aurelien Aptel Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit 369851be561fe115c026df020e6beb2bc01f27db Author: Jakub Kicinski Date: Fri Oct 18 09:16:58 2019 -0700 net: netem: correct the parent's backlog when corrupted packet was dropped commit e0ad032e144731a5928f2d75e91c2064ba1a764c upstream. If packet corruption failed we jump to finish_segs and return NET_XMIT_SUCCESS. Seeing success will make the parent qdisc increment its backlog, that's incorrect - we need to return NET_XMIT_DROP. Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue") Signed-off-by: Jakub Kicinski Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 4f1c4a3446654e79dad3dcfe5a984c984c70316d Author: Juergen Gross Date: Fri Oct 18 09:45:49 2019 +0200 xen/netback: fix error path of xenvif_connect_data() commit 3d5c1a037d37392a6859afbde49be5ba6a70a6b3 upstream. xenvif_connect_data() calls module_put() in case of error. This is wrong as there is no related module_get(). Remove the superfluous module_put(). Fixes: 279f438e36c0a7 ("xen-netback: Don't destroy the netdev until the vif is shut down") Signed-off-by: Juergen Gross Reviewed-by: Paul Durrant Reviewed-by: Wei Liu Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit e57630148a7f62c47513d9c245f0306d46c43124 Author: Johan Hovold Date: Fri Oct 18 17:19:54 2019 +0200 USB: ldusb: fix read info leaks commit 7a6f22d7479b7a0b68eadd308a997dd64dda7dae upstream. Fix broken read implementation, which could be used to trigger slab info leaks. The driver failed to check if the custom ring buffer was still empty when waking up after having waited for more data. This would happen on every interrupt-in completion, even if no data had been added to the ring buffer (e.g. on disconnect events). Due to missing sanity checks and uninitialised (kmalloced) ring-buffer entries, this meant that huge slab info leaks could easily be triggered. Note that the empty-buffer check after wakeup is enough to fix the info leak on disconnect, but let's clear the buffer on allocation and add a sanity check to read() to prevent further leaks. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Reported-by: syzbot+6fe95b826644f7f12b0b@syzkaller.appspotmail.com Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191018151955.25135-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 6a6055236ee213d71e334fbc0bc86ecb1f3d01c1 Author: Doug Berger Date: Wed Oct 16 16:06:32 2019 -0700 net: bcmgenet: reset 40nm EPHY on energy detect commit 25382b991d252aed961cd434176240f9de6bb15f upstream. The EPHY integrated into the 40nm Set-Top Box devices can falsely detect energy when connected to a disabled peer interface. When the peer interface is enabled the EPHY will detect and report the link as active, but on occasion may get into a state where it is not able to exchange data with the connected GENET MAC. This issue has not been observed when the link parameters are auto-negotiated; however, it has been observed with a manually configured link. It has been empirically determined that issuing a soft reset to the EPHY when energy is detected prevents it from getting into this bad state. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger Acked-by: Florian Fainelli Signed-off-by: David S. Miller [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 07bc7073ae0339e3e24b19f319e0d1902ea42a9c Author: Doug Berger Date: Wed Oct 16 16:06:30 2019 -0700 net: phy: bcm7xxx: define soft_reset for 40nm EPHY commit fe586b823372a9f43f90e2c6aa0573992ce7ccb7 upstream. The internal 40nm EPHYs use a "Workaround for putting the PHY in IDDQ mode." These PHYs require a soft reset to restore functionality after they are powered back up. This commit defines the soft_reset function to use genphy_soft_reset during phy_init_hw to accommodate this. Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset") Signed-off-by: Doug Berger Acked-by: Florian Fainelli Signed-off-by: David S. Miller [bwh: Backported to 3.16: - Delete trailing backslash; there is a single entry for 40 nm PHYs and not a macro definition - Adjust context] Signed-off-by: Ben Hutchings commit 76e7958dc9c8c35ce89192ced1a76353218c940a Author: Yufen Yu Date: Tue Oct 15 21:05:56 2019 +0800 scsi: core: try to get module before removing device commit 77c301287ebae86cc71d03eb3806f271cb14da79 upstream. We have a test case like block/001 in blktests, which will create a scsi device by loading scsi_debug module and then try to delete the device by sysfs interface. At the same time, it may remove the scsi_debug module. And getting a invalid paging request BUG_ON as following: [ 34.625854] BUG: unable to handle page fault for address: ffffffffa0016bb8 [ 34.629189] Oops: 0000 [#1] SMP PTI [ 34.629618] CPU: 1 PID: 450 Comm: bash Tainted: G W 5.4.0-rc3+ #473 [ 34.632524] RIP: 0010:scsi_proc_hostdir_rm+0x5/0xa0 [ 34.643555] CR2: ffffffffa0016bb8 CR3: 000000012cd88000 CR4: 00000000000006e0 [ 34.644545] Call Trace: [ 34.644907] scsi_host_dev_release+0x6b/0x1f0 [ 34.645511] device_release+0x74/0x110 [ 34.646046] kobject_put+0x116/0x390 [ 34.646559] put_device+0x17/0x30 [ 34.647041] scsi_target_dev_release+0x2b/0x40 [ 34.647652] device_release+0x74/0x110 [ 34.648186] kobject_put+0x116/0x390 [ 34.648691] put_device+0x17/0x30 [ 34.649157] scsi_device_dev_release_usercontext+0x2e8/0x360 [ 34.649953] execute_in_process_context+0x29/0x80 [ 34.650603] scsi_device_dev_release+0x20/0x30 [ 34.651221] device_release+0x74/0x110 [ 34.651732] kobject_put+0x116/0x390 [ 34.652230] sysfs_unbreak_active_protection+0x3f/0x50 [ 34.652935] sdev_store_delete.cold.4+0x71/0x8f [ 34.653579] dev_attr_store+0x1b/0x40 [ 34.654103] sysfs_kf_write+0x3d/0x60 [ 34.654603] kernfs_fop_write+0x174/0x250 [ 34.655165] __vfs_write+0x1f/0x60 [ 34.655639] vfs_write+0xc7/0x280 [ 34.656117] ksys_write+0x6d/0x140 [ 34.656591] __x64_sys_write+0x1e/0x30 [ 34.657114] do_syscall_64+0xb1/0x400 [ 34.657627] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 34.658335] RIP: 0033:0x7f156f337130 During deleting scsi target, the scsi_debug module have been removed. Then, sdebug_driver_template belonged to the module cannot be accessd, resulting in scsi_proc_hostdir_rm() BUG_ON. To fix the bug, we add scsi_device_get() in sdev_store_delete() to try to increase refcount of module, avoiding the module been removed. Link: https://lore.kernel.org/r/20191015130556.18061-1-yuyufen@huawei.com Signed-off-by: Yufen Yu Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Ben Hutchings commit 6a80b5382babec0a0aced7169b27c2fe5212b019 Author: Filipe Manana Date: Wed Oct 16 16:28:52 2019 +0100 Btrfs: check for the full sync flag while holding the inode lock during fsync commit ba0b084ac309283db6e329785c1dc4f45fdbd379 upstream. We were checking for the full fsync flag in the inode before locking the inode, which is racy, since at that that time it might not be set but after we acquire the inode lock some other task set it. One case where this can happen is on a system low on memory and some concurrent task failed to allocate an extent map and therefore set the full sync flag on the inode, to force the next fsync to work in full mode. A consequence of missing the full fsync flag set is hitting the problems fixed by commit 0c713cbab620 ("Btrfs: fix race between ranged fsync and writeback of adjacent ranges"), BUG_ON() when dropping extents from a log tree, hitting assertion failures at tree-log.c:copy_items() or all sorts of weird inconsistencies after replaying a log due to file extents items representing ranges that overlap. So just move the check such that it's done after locking the inode and before starting writeback again. Fixes: 0c713cbab620 ("Btrfs: fix race between ranged fsync and writeback of adjacent ranges") Signed-off-by: Filipe Manana Signed-off-by: David Sterba [bwh: Backported to 3.16: - Keep the "len" variable since we pass it to btrfs_wait_ordered_range() in two different places here - Adjust context] Signed-off-by: Ben Hutchings commit 848cf94ec9943ecb4fc8437466753cd0cd62f515 Author: Johan Hovold Date: Fri Oct 11 11:57:35 2019 +0200 USB: serial: ti_usb_3410_5052: fix port-close races commit 6f1d1dc8d540a9aa6e39b9cb86d3a67bbc1c8d8d upstream. Fix races between closing a port and opening or closing another port on the same device which could lead to a failure to start or stop the shared interrupt URB. The latter could potentially cause a use-after-free or worse in the completion handler on driver unbind. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Signed-off-by: Ben Hutchings commit b2f07f4986d56f1d7c2f3351d3c76e2343740d77 Author: Florian Fainelli Date: Tue Oct 15 10:45:47 2019 -0700 net: bcmgenet: Fix RGMII_MODE_EN value for GENET v1/2/3 commit efb86fede98cdc70b674692ff617b1162f642c49 upstream. The RGMII_MODE_EN bit value was 0 for GENET versions 1 through 3, and became 6 for GENET v4 and above, account for that difference. Fixes: aa09677cba42 ("net: bcmgenet: add MDIO routines") Signed-off-by: Florian Fainelli Acked-by: Doug Berger Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 85e4c4176d83e7a8685d3e8626546ce042f60cdc Author: Eric Dumazet Date: Mon Oct 14 11:22:30 2019 -0700 net: avoid potential infinite loop in tc_ctl_action() commit 39f13ea2f61b439ebe0060393e9c39925c9ee28c upstream. tc_ctl_action() has the ability to loop forever if tcf_action_add() returns -EAGAIN. This special case has been done in case a module needed to be loaded, but it turns out that tcf_add_notify() could also return -EAGAIN if the socket sk_rcvbuf limit is hit. We need to separate the two cases, and only loop for the module loading case. While we are at it, add a limit of 10 attempts since unbounded loops are always scary. syzbot repro was something like : socket(PF_NETLINK, SOCK_RAW|SOCK_NONBLOCK, NETLINK_ROUTE) = 3 write(3, ..., 38) = 38 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{..., 388}], msg_controllen=0, msg_flags=0x10}, ...) NMI backtrace for cpu 0 CPU: 0 PID: 1054 Comm: khungtaskd Not tainted 5.4.0-rc1+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline] watchdog+0x9d0/0xef0 kernel/hung_task.c:289 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 8859 Comm: syz-executor910 Not tainted 5.4.0-rc1+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:arch_local_save_flags arch/x86/include/asm/paravirt.h:751 [inline] RIP: 0010:lockdep_hardirqs_off+0x1df/0x2e0 kernel/locking/lockdep.c:3453 Code: 5c 08 00 00 5b 41 5c 41 5d 5d c3 48 c7 c0 58 1d f3 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 d3 00 00 00 <48> 83 3d 21 9e 99 07 00 0f 84 b9 00 00 00 9c 58 0f 1f 44 00 00 f6 RSP: 0018:ffff8880a6f3f1b8 EFLAGS: 00000046 RAX: 1ffffffff11e63ab RBX: ffff88808c9c6080 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff88808c9c6914 RBP: ffff8880a6f3f1d0 R08: ffff88808c9c6080 R09: fffffbfff16be5d1 R10: fffffbfff16be5d0 R11: 0000000000000003 R12: ffffffff8746591f R13: ffff88808c9c6080 R14: ffffffff8746591f R15: 0000000000000003 FS: 00000000011e4880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffff600400 CR3: 00000000a8920000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: trace_hardirqs_off+0x62/0x240 kernel/trace/trace_preemptirq.c:45 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline] _raw_spin_lock_irqsave+0x6f/0xcd kernel/locking/spinlock.c:159 __wake_up_common_lock+0xc8/0x150 kernel/sched/wait.c:122 __wake_up+0xe/0x10 kernel/sched/wait.c:142 netlink_unlock_table net/netlink/af_netlink.c:466 [inline] netlink_unlock_table net/netlink/af_netlink.c:463 [inline] netlink_broadcast_filtered+0x705/0xb80 net/netlink/af_netlink.c:1514 netlink_broadcast+0x3a/0x50 net/netlink/af_netlink.c:1534 rtnetlink_send+0xdd/0x110 net/core/rtnetlink.c:714 tcf_add_notify net/sched/act_api.c:1343 [inline] tcf_action_add+0x243/0x370 net/sched/act_api.c:1362 tc_ctl_action+0x3b5/0x4bc net/sched/act_api.c:1410 rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5386 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5404 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x803/0x920 net/socket.c:2311 __sys_sendmsg+0x105/0x1d0 net/socket.c:2356 __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg net/socket.c:2363 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440939 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Reported-by: syzbot+cf0adbb9c28c8866c788@syzkaller.appspotmail.com Signed-off-by: David S. Miller [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit c8d00a18eba1cdd3782d07442b6cf83874a1ad35 Author: Rafael J. Wysocki Date: Mon Oct 14 13:25:00 2019 +0200 PCI: PM: Fix pci_power_up() commit 45144d42f299455911cc29366656c7324a3a7c97 upstream. There is an arbitrary difference between the system resume and runtime resume code paths for PCI devices regarding the delay to apply when switching the devices from D3cold to D0. Namely, pci_restore_standard_config() used in the runtime resume code path calls pci_set_power_state() which in turn invokes __pci_start_power_transition() to power up the device through the platform firmware and that function applies the transition delay (as per PCI Express Base Specification Revision 2.0, Section 6.6.1). However, pci_pm_default_resume_early() used in the system resume code path calls pci_power_up() which doesn't apply the delay at all and that causes issues to occur during resume from suspend-to-idle on some systems where the delay is required. Since there is no reason for that difference to exist, modify pci_power_up() to follow pci_set_power_state() more closely and invoke __pci_start_power_transition() from there to call the platform firmware to power up the device (in case that's necessary). Fixes: db288c9c5f9d ("PCI / PM: restore the original behavior of pci_set_power_state()") Reported-by: Daniel Drake Tested-by: Daniel Drake Link: https://lore.kernel.org/linux-pm/CAD8Lp44TYxrMgPLkHCqF9hv6smEurMXvmmvmtyFhZ6Q4SE+dig@mail.gmail.com/T/#m21be74af263c6a34f36e0fc5c77c5449d9406925 Signed-off-by: Rafael J. Wysocki Acked-by: Bjorn Helgaas Signed-off-by: Ben Hutchings commit b98c86abf7eaf87ff3207c90856e11fe7d97b717 Author: Johan Hovold Date: Tue Oct 15 19:55:22 2019 +0200 USB: usblp: fix use-after-free on disconnect commit 7a759197974894213621aa65f0571b51904733d6 upstream. A recent commit addressing a runtime PM use-count regression, introduced a use-after-free by not making sure we held a reference to the struct usb_interface for the lifetime of the driver data. Fixes: 9a31535859bf ("USB: usblp: fix runtime PM after driver unbind") Reported-by: syzbot+cd24df4d075c319ebfc5@syzkaller.appspotmail.com Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191015175522.18490-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 8cd7ec48575c23213a1ed1798c4afd928ac930dc Author: Gustavo A. R. Silva Date: Mon Oct 14 14:18:30 2019 -0500 usb: udc: lpc32xx: fix bad bit shift operation commit b987b66ac3a2bc2f7b03a0ba48a07dc553100c07 upstream. It seems that the right variable to use in this case is *i*, instead of *n*, otherwise there is an undefined behavior when right shifiting by more than 31 bits when multiplying n by 8; notice that *n* can take values equal or greater than 4 (4, 8, 16, ...). Also, notice that under the current conditions (bl = 3), we are skiping the handling of bytes 3, 7, 31... So, fix this by updating this logic and limit *bl* up to 4 instead of up to 3. This fix is based on function udc_stuff_fifo(). Addresses-Coverity-ID: 1454834 ("Bad bit shift operation") Fixes: 24a28e428351 ("USB: gadget driver for LPC32xx") Signed-off-by: Gustavo A. R. Silva Link: https://lore.kernel.org/r/20191014191830.GA10721@embeddedor Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings commit c7ca3ea798c10f8d63754ce123a4b4ce2a702cf6 Author: Dan Carpenter Date: Fri Oct 11 17:11:15 2019 +0300 USB: legousbtower: fix a signedness bug in tower_probe() commit fd47a417e75e2506eb3672ae569b1c87e3774155 upstream. The problem is that sizeof() is unsigned long so negative error codes are type promoted to high positive values and the condition becomes false. Fixes: 1d427be4a39d ("USB: legousbtower: fix slab info leak at probe") Signed-off-by: Dan Carpenter Acked-by: Johan Hovold Link: https://lore.kernel.org/r/20191011141115.GA4521@mwanda Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit bf3c41fc7b174872f2dcc32785415307f151a8c2 Author: Johan Hovold Date: Thu Oct 10 14:58:35 2019 +0200 USB: legousbtower: fix memleak on disconnect commit b6c03e5f7b463efcafd1ce141bd5a8fc4e583ae2 upstream. If disconnect() races with release() after a process has been interrupted, release() could end up returning early and the driver would fail to free its driver data. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191010125835.27031-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 35ee93b667ce52ec7da24a3e539c9fa128dc74a8 Author: Johan Hovold Date: Thu Oct 10 14:58:34 2019 +0200 USB: ldusb: fix memleak on disconnect commit b14a39048c1156cfee76228bf449852da2f14df8 upstream. If disconnect() races with release() after a process has been interrupted, release() could end up returning early and the driver would fail to free its driver data. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191010125835.27031-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit b66be89569dd5b233622edd83ccf291d8cefc13d Author: Jeff Layton Date: Thu Sep 26 16:05:11 2019 -0400 ceph: just skip unrecognized info in ceph_reply_info_extra commit 1d3f87233e26362fc3d4e59f0f31a71b570f90b9 upstream. In the future, we're going to want to extend the ceph_reply_info_extra for create replies. Currently though, the kernel code doesn't accept an extra blob that is larger than the expected data. Change the code to skip over any unrecognized fields at the end of the extra blob, rather than returning -EIO. Signed-off-by: Jeff Layton Signed-off-by: Ilya Dryomov [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit e19a0f181e9702a55f6b12db58abd2a649e141f4 Author: Max Filippov Date: Mon Oct 14 15:48:19 2019 -0700 xtensa: drop EXPORT_SYMBOL for outs*/ins* commit 8b39da985194aac2998dd9e3a22d00b596cebf1e upstream. Custom outs*/ins* implementations are long gone from the xtensa port, remove matching EXPORT_SYMBOLs. This fixes the following build warnings issued by modpost since commit 15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL* functions"): WARNING: "insb" [vmlinux] is a static EXPORT_SYMBOL WARNING: "insw" [vmlinux] is a static EXPORT_SYMBOL WARNING: "insl" [vmlinux] is a static EXPORT_SYMBOL WARNING: "outsb" [vmlinux] is a static EXPORT_SYMBOL WARNING: "outsw" [vmlinux] is a static EXPORT_SYMBOL WARNING: "outsl" [vmlinux] is a static EXPORT_SYMBOL Fixes: d38efc1f150f ("xtensa: adopt generic io routines") Signed-off-by: Max Filippov Signed-off-by: Ben Hutchings commit 42cd411240e1764a3caf7057a5d2ec70026311ac Author: Qian Cai Date: Mon Oct 14 14:11:51 2019 -0700 mm/slub: fix a deadlock in show_slab_objects() commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream. A long time ago we fixed a similar deadlock in show_slab_objects() [1]. However, it is apparently due to the commits like 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") and 03afc0e25f7f ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by just reading files in /sys/kernel/slab which will generate a lockdep splat below. Since the "mem_hotplug_lock" here is only to obtain a stable online node mask while racing with NUMA node hotplug, in the worst case, the results may me miscalculated while doing NUMA node hotplug, but they shall be corrected by later reads of the same files. WARNING: possible circular locking dependency detected ------------------------------------------------------ cat/5224 is trying to acquire lock: ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at: show_slab_objects+0x94/0x3a8 but task is already holding lock: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (kn->count#45){++++}: lock_acquire+0x31c/0x360 __kernfs_remove+0x290/0x490 kernfs_remove+0x30/0x44 sysfs_remove_dir+0x70/0x88 kobject_del+0x50/0xb0 sysfs_slab_unlink+0x2c/0x38 shutdown_cache+0xa0/0xf0 kmemcg_cache_shutdown_fn+0x1c/0x34 kmemcg_workfn+0x44/0x64 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #1 (slab_mutex){+.+.}: lock_acquire+0x31c/0x360 __mutex_lock_common+0x16c/0xf78 mutex_lock_nested+0x40/0x50 memcg_create_kmem_cache+0x38/0x16c memcg_kmem_cache_create_func+0x3c/0x70 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (mem_hotplug_lock.rw_sem){++++}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#45); lock(slab_mutex); lock(kn->count#45); lock(mem_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by cat/5224: #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8 #1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0 #2: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 stack backtrace: Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc I think it is important to mention that this doesn't expose the show_slab_objects to use-after-free. There is only a single path that might really race here and that is the slab hotplug notifier callback __kmem_cache_shrink (via slab_mem_going_offline_callback) but that path doesn't really destroy kmem_cache_node data structures. [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html [akpm@linux-foundation.org: add comment explaining why we don't need mem_hotplug_lock] Link: http://lkml.kernel.org/r/1570192309-10132-1-git-send-email-cai@lca.pw Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") Fixes: 03afc0e25f7f ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}") Signed-off-by: Qian Cai Acked-by: Michal Hocko Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Tejun Heo Cc: Vladimir Davydov Cc: Roman Gushchin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 398a4d5254d5ceb7687b8b4ae7fd42fc0cb0fcf4 Author: Helge Deller Date: Fri Oct 4 19:23:37 2019 +0200 parisc: Fix vmap memory leak in ioremap()/iounmap() commit 513f7f747e1cba81f28a436911fba0b485878ebd upstream. Sven noticed that calling ioremap() and iounmap() multiple times leads to a vmap memory leak: vmap allocation for size 4198400 failed: use vmalloc= to increase size It seems we missed calling vunmap() in iounmap(). Signed-off-by: Helge Deller Noticed-by: Sven Schnelle Signed-off-by: Ben Hutchings commit 6f2088aaa4bf8df9c35c92dbc9fe381f49b8f62a Author: Sven Eckelmann Date: Thu Oct 3 17:02:01 2019 +0200 batman-adv: Avoid free/alloc race when handling OGM buffer commit 40e220b4218bb3d278e5e8cc04ccdfd1c7ff8307 upstream. Each slave interface of an B.A.T.M.A.N. IV virtual interface has an OGM packet buffer which is initialized using data from netdevice notifier and other rtnetlink related hooks. It is sent regularly via various slave interfaces of the batadv virtual interface and in this process also modified (realloced) to integrate additional state information via TVLV containers. It must be avoided that the worker item is executed without a common lock with the netdevice notifier/rtnetlink helpers. Otherwise it can either happen that half modified/freed data is sent out or functions modifying the OGM buffer try to access already freed memory regions. Reported-by: syzbot+0cc629f19ccb8534935b@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich [bwh: Backported to 3.16: - Don't add status check in batadv_iv_ogm_schedule() - Adjust context] Signed-off-by: Ben Hutchings commit bc1e75e2f759973eb81dd09757952f1a2f95e607 Author: Markus Pargmann Date: Fri Dec 26 12:41:26 2014 +0100 batman-adv: iv_ogm_iface_enable, direct return values commit 42d9f2cbd42e1ba137584da8305cf6f68b84f39d upstream. Directly return error values. No need to use a return variable. Signed-off-by: Markus Pargmann Signed-off-by: Marek Lindner [bwh: Backported to 3.16 as dependency of commit 40e220b4218b "batman-adv: Avoid free/alloc race when handling OGM buffer"] Signed-off-by: Ben Hutchings commit a2cae155ba4117151ea01adeee4138777a108376 Author: Steven Rostedt (VMware) Date: Fri Oct 11 18:19:17 2019 -0400 tracing: Get trace_array reference for available_tracers files commit 194c2c74f5532e62c218adeb8e2b683119503907 upstream. As instances may have different tracers available, we need to look at the trace_array descriptor that shows the list of the available tracers for the instance. But there's a race between opening the file and an admin deleting the instance. The trace_array_get() needs to be called before accessing the trace_array. Fixes: 607e2ea167e56 ("tracing: Set up infrastructure to allow tracers for instances") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Ben Hutchings commit 091362b93ff20246ffe74278937699c4a53a4329 Author: Johan Hovold Date: Wed Oct 9 17:38:48 2019 +0200 USB: yurex: fix NULL-derefs on disconnect commit aafb00a977cf7d81821f7c9d12e04c558c22dc3c upstream. The driver was using its struct usb_interface pointer as an inverted disconnected flag, but was setting it to NULL without making sure all code paths that used it were done with it. Before commit ef61eb43ada6 ("USB: yurex: Fix protection fault after device removal") this included the interrupt-in completion handler, but there are further accesses in dev_err and dev_dbg statements in yurex_write() and the driver-data destructor (sic!). Fix this by unconditionally stopping also the control URB at disconnect and by using a dedicated disconnected flag. Note that we need to take a reference to the struct usb_interface to avoid a use-after-free in the destructor whenever the device was disconnected while the character device was still open. Fixes: aadd6472d904 ("USB: yurex.c: remove dbg() usage") Fixes: 45714104b9e8 ("USB: yurex.c: remove err() usage") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009153848.8664-6-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 1872d33bbdc54baabe223e3ca5b0535415ac3cf8 Author: Johan Hovold Date: Wed Oct 9 12:48:43 2019 +0200 USB: iowarrior: fix use-after-free after driver unbind commit b5f8d46867ca233d773408ffbe691a8062ed718f upstream. Make sure to stop also the asynchronous write URBs on disconnect() to avoid use-after-free in the completion handler after driver unbind. Fixes: 946b960d13c1 ("USB: add driver for iowarrior devices.") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009104846.5925-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 57418465451c10a13637746f922d2c91ac174dba Author: Johan Hovold Date: Wed Oct 9 12:48:42 2019 +0200 USB: iowarrior: fix use-after-free on release commit 80cd5479b525093a56ef768553045741af61b250 upstream. The driver was accessing its struct usb_interface from its release() callback without holding a reference. This would lead to a use-after-free whenever debugging was enabled and the device was disconnected while its character device was open. Fixes: 549e83500b80 ("USB: iowarrior: Convert local dbg macro to dev_dbg") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009104846.5925-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 425be35ac0ff315a9398025bbe2057c1e9f2e507 Author: Johan Hovold Date: Wed Oct 9 17:38:44 2019 +0200 USB: adutux: fix use-after-free on release commit 123a0f125fa3d2104043697baa62899d9e549272 upstream. The driver was accessing its struct usb_device in its release() callback without holding a reference. This would lead to a use-after-free whenever the device was disconnected while the character device was still open. Fixes: 66d4bc30d128 ("USB: adutux: remove custom debug macro") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009153848.8664-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 8ed30998bd7c341476334a0457440679206cfbb9 Author: Johan Hovold Date: Wed Oct 9 17:38:46 2019 +0200 USB: ldusb: fix NULL-derefs on driver unbind commit 58ecf131e74620305175a7aa103f81350bb37570 upstream. The driver was using its struct usb_interface pointer as an inverted disconnected flag, but was setting it to NULL before making sure all completion handlers had run. This could lead to a NULL-pointer dereference in a number of dev_dbg, dev_warn and dev_err statements in the completion handlers which relies on said pointer. Fix this by unconditionally stopping all I/O and preventing resubmissions by poisoning the interrupt URBs at disconnect and using a dedicated disconnected flag. This also makes sure that all I/O has completed by the time the disconnect callback returns. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009153848.8664-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 0d2775be1c056a72fe1da204bc121b88410f432a Author: Johan Hovold Date: Wed Oct 9 17:38:47 2019 +0200 USB: legousbtower: fix use-after-free on release commit 726b55d0e22ca72c69c947af87785c830289ddbc upstream. The driver was accessing its struct usb_device in its release() callback without holding a reference. This would lead to a use-after-free whenever the device was disconnected while the character device was still open. Fixes: fef526cae700 ("USB: legousbtower: remove custom debug macro") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009153848.8664-5-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 90461e68ee390441684ef9d28feeaa5e34c3863c Author: Johan Hovold Date: Wed Oct 9 19:09:42 2019 +0200 USB: usb-skeleton: fix NULL-deref on disconnect commit bed5ef230943863b9abf5eae226a20fad9a8ff71 upstream. The driver was using its struct usb_interface pointer as an inverted disconnected flag and was setting it to NULL before making sure all completion handlers had run. This could lead to NULL-pointer dereferences in the dev_err() statements in the completion handlers which relies on said pointer. Fix this by using a dedicated disconnected flag. Note that this is also addresses a NULL-pointer dereference at release() and a struct usb_interface reference leak introduced by a recent runtime PM fix, which depends on and should have been submitted together with this patch. Fixes: 4212cd74ca6f ("USB: usb-skeleton.c: remove err() usage") Fixes: 5c290a5e42c3 ("USB: usb-skeleton: fix runtime PM after driver unbind") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191009170944.30057-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 52894b441da146d4165e58b44d15d78c872d86ac Author: Russell King Date: Sat Aug 31 17:01:58 2019 +0100 ARM: mm: fix alignment handler faults under memory pressure commit 67e15fa5b487adb9b78a92789eeff2d6ec8f5cee upstream. When the system has high memory pressure, the page containing the instruction may be paged out. Using probe_kernel_address() means that if the page is swapped out, the resulting page fault will not be handled because page faults are disabled by this function. Use get_user() to read the instruction instead. Reported-by: Jing Xiangfeng Fixes: b255188f90e2 ("ARM: fix scheduling while atomic warning in alignment handling code") Signed-off-by: Russell King [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit db1525999aeaf1bbc463cee815c2669725bc01c9 Author: Xuewei Zhang Date: Thu Oct 3 17:12:43 2019 -0700 sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision commit 4929a4e6faa0f13289a67cae98139e727f0d4a97 upstream. The quota/period ratio is used to ensure a child task group won't get more bandwidth than the parent task group, and is calculated as: normalized_cfs_quota() = [(quota_us << 20) / period_us] If the quota/period ratio was changed during this scaling due to precision loss, it will cause inconsistency between parent and child task groups. See below example: A userspace container manager (kubelet) does three operations: 1) Create a parent cgroup, set quota to 1,000us and period to 10,000us. 2) Create a few children cgroups. 3) Set quota to 1,000us and period to 10,000us on a child cgroup. These operations are expected to succeed. However, if the scaling of 147/128 happens before step 3, quota and period of the parent cgroup will be changed: new_quota: 1148437ns, 1148us new_period: 11484375ns, 11484us And when step 3 comes in, the ratio of the child cgroup will be 104857, which will be larger than the parent cgroup ratio (104821), and will fail. Scaling them by a factor of 2 will fix the problem. Tested-by: Phil Auld Signed-off-by: Xuewei Zhang Signed-off-by: Peter Zijlstra (Intel) Acked-by: Phil Auld Cc: Anton Blanchard Cc: Ben Segall Cc: Dietmar Eggemann Cc: Juri Lelli Cc: Linus Torvalds Cc: Mel Gorman Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Thomas Gleixner Cc: Vincent Guittot Fixes: 2e8e19226398 ("sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup") Link: https://lkml.kernel.org/r/20191004001243.140897-1-xueweiz@google.com Signed-off-by: Ingo Molnar Signed-off-by: Ben Hutchings commit 73b3f5c951a7f0e127854c8b6f0f79d5b21e9f14 Author: Christophe JAILLET Date: Sat Oct 5 13:21:01 2019 +0200 memstick: jmb38x_ms: Fix an error handling path in 'jmb38x_ms_probe()' commit 28c9fac09ab0147158db0baeec630407a5e9b892 upstream. If 'jmb38x_ms_count_slots()' returns 0, we must undo the previous 'pci_request_regions()' call. Goto 'err_out_int' to fix it. Fixes: 60fdd931d577 ("memstick: add support for JMicron jmb38x MemoryStick host controller") Signed-off-by: Christophe JAILLET Signed-off-by: Ulf Hansson Signed-off-by: Ben Hutchings commit 3aa64a2ac703f3957d5b5ad716fa24a8ad2d63bf Author: Pavel Shilovsky Date: Mon Sep 30 10:06:20 2019 -0700 CIFS: Force reval dentry if LOOKUP_REVAL flag is set commit 0b3d0ef9840f7be202393ca9116b857f6f793715 upstream. Mark inode for force revalidation if LOOKUP_REVAL flag is set. This tells the client to actually send a QueryInfo request to the server to obtain the latest metadata in case a directory or a file were changed remotely. Only do that if the client doesn't have a lease for the file to avoid unneeded round trips to the server. Signed-off-by: Pavel Shilovsky Signed-off-by: Steve French [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 73524a840bd68bb0e3c45b07a7d5efbc1ccc736f Author: Pavel Shilovsky Date: Mon Sep 30 10:06:19 2019 -0700 CIFS: Force revalidate inode when dentry is stale commit c82e5ac7fe3570a269c0929bf7899f62048e7dbc upstream. Currently the client indicates that a dentry is stale when inode numbers or type types between a local inode and a remote file don't match. If this is the case attributes is not being copied from remote to local, so, it is already known that the local copy has stale metadata. That's why the inode needs to be marked for revalidation in order to tell the VFS to lookup the dentry again before openning a file. This prevents unexpected stale errors to be returned to the user space when openning a file. Signed-off-by: Pavel Shilovsky Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit 4fbe40871decdbc815900ccc95258ddce9eb5991 Author: Ross Lagerwall Date: Wed Dec 2 14:46:08 2015 +0000 cifs: Check uniqueid for SMB2+ and return -ESTALE if necessary commit a108471b5730b52017e73b58c9f486319d2ac308 upstream. Commit 7196ac113a4f ("Fix to check Unique id and FileType when client refer file directly.") checks whether the uniqueid of an inode has changed when getting the inode info, but only when using the UNIX extensions. Add a similar check for SMB2+, since this can be done without an extra network roundtrip. Signed-off-by: Ross Lagerwall Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit b43f022a15f4d3e969f562c933234850c8d47049 Author: Nakajima Akira Date: Wed Apr 22 15:24:44 2015 +0900 Fix to check Unique id and FileType when client refer file directly. commit 7196ac113a4f38b7ca1a3282fd9edf328bd22287 upstream. When you refer file directly on cifs client, (e.g. ls -li , cd , stat ) the function return old inode number and filetype from old inode cache, though server has different inode number or filetype. When server is Windows, cifs client has same problem. When Server is Windows , This patch fixes bug in different filetype, but does not fix bug in different inode number. Because QUERY_PATH_INFO response by Windows does not include inode number(Index Number) . BUG INFO https://bugzilla.kernel.org/show_bug.cgi?id=90021 https://bugzilla.kernel.org/show_bug.cgi?id=90031 Reported-by: Nakajima Akira Signed-off-by: Nakajima Akira Reviewed-by: Shirish Pargaonkar Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit da7156ad13b8eb8ad32774769a1117df41474d41 Author: Eric Biggers Date: Sun Oct 6 14:24:24 2019 -0700 llc: fix sk_buff leak in llc_sap_state_process() commit c6ee11c39fcc1fb55130748990a8f199e76263b4 upstream. syzbot reported: BUG: memory leak unreferenced object 0xffff888116270800 (size 224): comm "syz-executor641", pid 7047, jiffies 4294947360 (age 13.860s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 20 e1 2a 81 88 ff ff 00 40 3d 2a 81 88 ff ff . .*.....@=*.... backtrace: [<000000004d41b4cc>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000004d41b4cc>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000004d41b4cc>] slab_alloc_node mm/slab.c:3269 [inline] [<000000004d41b4cc>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000506a5965>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<000000001ba5a161>] alloc_skb include/linux/skbuff.h:1058 [inline] [<000000001ba5a161>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<0000000047d9c78b>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<000000003828fe54>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e34d94f9>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000de2de3fb>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000de2de3fb>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000008fe16e7a>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [...] The bug is that llc_sap_state_process() always takes an extra reference to the skb, but sometimes neither llc_sap_next_state() nor llc_sap_state_process() itself drops this reference. Fix it by changing llc_sap_next_state() to never consume a reference to the skb, rather than sometimes do so and sometimes not. Then remove the extra skb_get() and kfree_skb() from llc_sap_state_process(). Reported-by: syzbot+6bf095f9becf5efef645@syzkaller.appspotmail.com Reported-by: syzbot+31c16aa4202dace3812e@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Biggers Signed-off-by: Jakub Kicinski Signed-off-by: Ben Hutchings commit ca0b3c74c2bba563e4d8399a264af882fb5641ab Author: Will Deacon Date: Sun Oct 6 17:58:00 2019 -0700 panic: ensure preemption is disabled during panic() commit 20bb759a66be52cf4a9ddd17fddaf509e11490cd upstream. Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the calling CPU in an infinite loop, but with interrupts and preemption enabled. From this state, userspace can continue to be scheduled, despite the system being "dead" as far as the kernel is concerned. This is easily reproducible on arm64 when booting with "nosmp" on the command line; a couple of shell scripts print out a periodic "Ping" message whilst another triggers a crash by writing to /proc/sysrq-trigger: | sysrq: Trigger a crash | Kernel panic - not syncing: sysrq triggered crash | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x148 | show_stack+0x14/0x20 | dump_stack+0xa0/0xc4 | panic+0x140/0x32c | sysrq_handle_reboot+0x0/0x20 | __handle_sysrq+0x124/0x190 | write_sysrq_trigger+0x64/0x88 | proc_reg_write+0x60/0xa8 | __vfs_write+0x18/0x40 | vfs_write+0xa4/0x1b8 | ksys_write+0x64/0xf0 | __arm64_sys_write+0x14/0x20 | el0_svc_common.constprop.0+0xb0/0x168 | el0_svc_handler+0x28/0x78 | el0_svc+0x8/0xc | Kernel Offset: disabled | CPU features: 0x0002,24002004 | Memory Limit: none | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- | Ping 2! | Ping 1! | Ping 1! | Ping 2! The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise local interrupts are disabled in 'smp_send_stop()'. Disable preemption in 'panic()' before re-enabling interrupts. Link: http://lkml.kernel.org/r/20191002123538.22609-1-will@kernel.org Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone Signed-off-by: Will Deacon Reported-by: Xogium Reviewed-by: Kees Cook Cc: Russell King Cc: Greg Kroah-Hartman Cc: Ingo Molnar Cc: Petr Mladek Cc: Feng Tang Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings commit 27646afab73bb4d602ffb6d4ef7e570ae9559fbc Author: Pavel Shilovsky Date: Mon Sep 30 10:06:18 2019 -0700 CIFS: Gracefully handle QueryInfo errors during open commit 30573a82fb179420b8aac30a3a3595aa96a93156 upstream. Currently if the client identifies problems when processing metadata returned in CREATE response, the open handle is being leaked. This causes multiple problems like a file missing a lease break by that client which causes high latencies to other clients accessing the file. Another side-effect of this is that the file can't be deleted. Fix this by closing the file after the client hits an error after the file was opened and the open descriptor wasn't returned to the user space. Also convert -ESTALE to -EOPENSTALE to allow the VFS to revalidate a dentry and retry the open. Signed-off-by: Pavel Shilovsky Signed-off-by: Steve French Signed-off-by: Ben Hutchings commit 9fa9c410a7b2c91d7d1a482485ea67fdf0e39950 Author: Eric Dumazet Date: Fri Oct 4 11:08:34 2019 -0700 nfc: fix memory leak in llcp_sock_bind() commit a0c2dc1fe63e2869b74c1c7f6a81d1745c8a695d upstream. sysbot reported a memory leak after a bind() has failed. While we are at it, abort the operation if kmemdup() has failed. BUG: memory leak unreferenced object 0xffff888105d83ec0 (size 32): comm "syz-executor067", pid 7207, jiffies 4294956228 (age 19.430s) hex dump (first 32 bytes): 00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34 .ile read.net:[4 30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00 026533097]...... backtrace: [<0000000036bac473>] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline] [<0000000036bac473>] slab_post_alloc_hook /mm/slab.h:522 [inline] [<0000000036bac473>] slab_alloc /mm/slab.c:3319 [inline] [<0000000036bac473>] __do_kmalloc /mm/slab.c:3653 [inline] [<0000000036bac473>] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670 [<000000000cd39d07>] kmemdup+0x27/0x60 /mm/util.c:120 [<000000008e57e5fc>] kmemdup /./include/linux/string.h:432 [inline] [<000000008e57e5fc>] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107 [<000000009cb0b5d3>] __sys_bind+0x11c/0x140 /net/socket.c:1647 [<00000000492c3bbc>] __do_sys_bind /net/socket.c:1658 [inline] [<00000000492c3bbc>] __se_sys_bind /net/socket.c:1656 [inline] [<00000000492c3bbc>] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656 [<0000000008704b2a>] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296 [<000000009f4c57a4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 30cc4587659e ("NFC: Move LLCP code to the NFC top level diirectory") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit e070a3109850948388f8265a03a8e674683bc3c8 Author: Eric Dumazet Date: Fri Oct 4 10:34:45 2019 -0700 sch_dsmark: fix potential NULL deref in dsmark_init() commit 474f0813a3002cb299bb73a5a93aa1f537a80ca8 upstream. Make sure TCA_DSMARK_INDICES was provided by the user. syzbot reported : kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 8799 Comm: syz-executor235 Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:nla_get_u16 include/net/netlink.h:1501 [inline] RIP: 0010:dsmark_init net/sched/sch_dsmark.c:364 [inline] RIP: 0010:dsmark_init+0x193/0x640 net/sched/sch_dsmark.c:339 Code: 85 db 58 0f 88 7d 03 00 00 e8 e9 1a ac fb 48 8b 9d 70 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 ca RSP: 0018:ffff88809426f3b8 EFLAGS: 00010247 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff85c6eb09 RDX: 0000000000000000 RSI: ffffffff85c6eb17 RDI: 0000000000000004 RBP: ffff88809426f4b0 R08: ffff88808c4085c0 R09: ffffed1015d26159 R10: ffffed1015d26158 R11: ffff8880ae930ac7 R12: ffff8880a7e96940 R13: dffffc0000000000 R14: ffff88809426f8c0 R15: 0000000000000000 FS: 0000000001292880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000080 CR3: 000000008ca1b000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: qdisc_create+0x4ee/0x1210 net/sched/sch_api.c:1237 tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1653 rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x803/0x920 net/socket.c:2311 __sys_sendmsg+0x105/0x1d0 net/socket.c:2356 __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg net/socket.c:2363 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440369 Fixes: 758cc43c6d73 ("[PKT_SCHED]: Fix dsmark to apply changes consistent") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit cce25545927688855dd10a7b61c64f350b1af818 Author: Paolo Abeni Date: Fri Oct 4 15:11:17 2019 +0200 net: ipv4: avoid mixed n_redirects and rate_tokens usage commit b406472b5ad79ede8d10077f0c8f05505ace8b6d upstream. Since commit c09551c6ff7f ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets") we use 'n_redirects' to account for redirect packets, but we still use 'rate_tokens' to compute the redirect packets exponential backoff. If the device sent to the relevant peer any ICMP error packet after sending a redirect, it will also update 'rate_token' according to the leaking bucket schema; typically 'rate_token' will raise above BITS_PER_LONG and the redirect packets backoff algorithm will produce undefined behavior. Fix the issue using 'n_redirects' to compute the exponential backoff in ip_rt_send_redirect(). Note that we still clear rate_tokens after a redirect silence period, to avoid changing an established behaviour. The root cause predates git history; before the mentioned commit in the critical scenario, the kernel stopped sending redirects, after the mentioned commit the behavior more randomic. Reported-by: Xiumei Mu Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Fixes: c09551c6ff7f ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets") Signed-off-by: Paolo Abeni Acked-by: Lorenzo Bianconi Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit d4df303d9f271ac3bb45bea6e913aff32f8f9239 Author: Lorenzo Bianconi Date: Wed Feb 6 19:18:04 2019 +0100 net: ipv4: use a dedicated counter for icmp_v4 redirect packets commit c09551c6ff7fe16a79a42133bcecba5fc2fc3291 upstream. According to the algorithm described in the comment block at the beginning of ip_rt_send_redirect, the host should try to send 'ip_rt_redirect_number' ICMP redirect packets with an exponential backoff and then stop sending them at all assuming that the destination ignores redirects. If the device has previously sent some ICMP error packets that are rate-limited (e.g TTL expired) and continues to receive traffic, the redirect packets will never be transmitted. This happens since peer->rate_tokens will be typically greater than 'ip_rt_redirect_number' and so it will never be reset even if the redirect silence timeout (ip_rt_redirect_silence) has elapsed without receiving any packet requiring redirects. Fix it by using a dedicated counter for the number of ICMP redirect packets that has been sent by the host I have not been able to identify a given commit that introduced the issue since ip_rt_send_redirect implements the same rate-limiting algorithm from commit 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Lorenzo Bianconi Signed-off-by: David S. Miller [bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings commit ed5da1455def82331805e192c03e4eb909276257 Author: Randy Dunlap Date: Mon Sep 16 16:12:23 2019 -0700 serial: uartlite: fix exit path null pointer commit a553add0846f355a28ed4e81134012e4a1e280c2 upstream. Call uart_unregister_driver() conditionally instead of unconditionally, only if it has been previously registered. This uses driver.state, just as the sh-sci.c driver does. Fixes this null pointer dereference in tty_unregister_driver(), since the 'driver' argument is null: general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI RIP: 0010:tty_unregister_driver+0x25/0x1d0 Fixes: 238b8721a554 ("[PATCH] serial uartlite driver") Signed-off-by: Randy Dunlap Cc: Peter Korsgaard Link: https://lore.kernel.org/r/9c8e6581-6fcc-a595-0897-4d90f5d710df@infradead.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 6e6d80f302ad1fcc3ef3b290d9796cbad2bd1a5e Author: Johan Hovold Date: Tue Oct 1 10:49:08 2019 +0200 media: stkwebcam: fix runtime PM after driver unbind commit 30045f2174aab7fb4db7a9cf902d0aa6c75856a7 upstream. Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") USB drivers must always balance their runtime PM gets and puts, including when the driver has already been unbound from the interface. Leaving the interface with a positive PM usage counter would prevent a later bound driver from suspending the device. Note that runtime PM has never actually been enabled for this driver since the support_autosuspend flag in its usb_driver struct is not set. Fixes: c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") Acked-by: Mauro Carvalho Chehab Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191001084908.2003-5-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 1e2d7c75364c91b5b7cb2d7e3cff7d868f3e9749 Author: Johan Hovold Date: Tue Oct 1 10:49:07 2019 +0200 USB: serial: fix runtime PM after driver unbind commit d51bdb93ca7e71d7fb30a572c7b47ed0194bf3fe upstream. Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") USB drivers must always balance their runtime PM gets and puts, including when the driver has already been unbound from the interface. Leaving the interface with a positive PM usage counter would prevent a later bound driver from suspending the device. Fixes: c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191001084908.2003-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 31c3547d030bdee31840946277415431763ac174 Author: Johan Hovold Date: Tue Oct 1 10:49:06 2019 +0200 USB: usblp: fix runtime PM after driver unbind commit 9a31535859bfd8d1c3ed391f5e9247cd87bb7909 upstream. Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") USB drivers must always balance their runtime PM gets and puts, including when the driver has already been unbound from the interface. Leaving the interface with a positive PM usage counter would prevent a later bound driver from suspending the device. Fixes: c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191001084908.2003-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 9e3c26ddbcce4d4205b3cdbdfb74413c1e1f5a26 Author: Johan Hovold Date: Tue Oct 1 10:49:05 2019 +0200 USB: usb-skeleton: fix runtime PM after driver unbind commit 5c290a5e42c3387e82de86965784d30e6c5270fd upstream. Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") USB drivers must always balance their runtime PM gets and puts, including when the driver has already been unbound from the interface. Leaving the interface with a positive PM usage counter would prevent a later bound driver from suspending the device. Fixes: c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20191001084908.2003-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 080c32c9438b4269d9549fea5f7d5476af107e12 Author: Yoshihiro Shimoda Date: Tue Oct 1 19:10:33 2019 +0900 usb: renesas_usbhs: gadget: Fix usb_ep_set_{halt,wedge}() behavior commit 4d599cd3a097a85a5c68a2c82b9a48cddf9953ec upstream. According to usb_ep_set_halt()'s description, __usbhsg_ep_set_halt_wedge() should return -EAGAIN if the IN endpoint has any queue or data. Otherwise, this driver is possible to cause just STALL without sending a short packet data on g_mass_storage driver, and then a few resetting a device happens on a host side during a usb enumaration. Fixes: 2f98382dcdfe ("usb: renesas_usbhs: Add Renesas USBHS Gadget") Signed-off-by: Yoshihiro Shimoda Link: https://lore.kernel.org/r/1569924633-322-3-git-send-email-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 333fe262de5949c5b9f4131892a3969561dad9c3 Author: Yoshihiro Shimoda Date: Tue Oct 1 19:10:32 2019 +0900 usb: renesas_usbhs: gadget: Do not discard queues in usb_ep_set_{halt,wedge}() commit 1aae1394294cb71c6aa0bc904a94a7f2f1e75936 upstream. The commit 97664a207bc2 ("usb: renesas_usbhs: shrink spin lock area") had added a usbhsg_pipe_disable() calling into __usbhsg_ep_set_halt_wedge() accidentally. But, this driver should not call the usbhsg_pipe_disable() because the function discards all queues. So, this patch removes it. Fixes: 97664a207bc2 ("usb: renesas_usbhs: shrink spin lock area") Signed-off-by: Yoshihiro Shimoda Link: https://lore.kernel.org/r/1569924633-322-2-git-send-email-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit c8f4c46277ceac473dd1aa5b80c116f0fdbb7080 Author: Rick Tseng Date: Fri Oct 4 14:59:30 2019 +0300 usb: xhci: wait for CNR controller not ready bit in xhci resume commit a70bcbc322837eda1ab5994d12db941dc9733a7d upstream. NVIDIA 3.1 xHCI card would lose power when moving power state into D3Cold. Thus we need to wait for CNR bit to clear in xhci resume, just as in xhci init. [Minor changes to comment and commit message -Mathias] Signed-off-by: Rick Tseng Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/1570190373-30684-6-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: Add xhci as parameter to xhci_handshake()] Signed-off-by: Ben Hutchings commit 5a800ae59799e6ee89a83ad3f3d8a452492af023 Author: Jan Schmidt Date: Fri Oct 4 14:59:28 2019 +0300 xhci: Check all endpoints for LPM timeout commit d500c63f80f2ea08ee300e57da5f2af1c13875f5 upstream. If an endpoint is encountered that returns USB3_LPM_DEVICE_INITIATED, keep checking further endpoints, as there might be periodic endpoints later that return USB3_LPM_DISABLED due to shorter service intervals. Without this, the code can set too high a maximum-exit-latency and prevent the use of multiple USB3 cameras that should be able to work. Signed-off-by: Jan Schmidt Tested-by: Philipp Zabel Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/1570190373-30684-4-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 37f94b7b4b204258e6f62778d6cbff290cdb71fc Author: Mathias Nyman Date: Fri Oct 4 14:59:27 2019 +0300 xhci: Prevent device initiated U1/U2 link pm if exit latency is too long commit cd9d9491e835a845c1a98b8471f88d26285e0bb9 upstream. If host/hub initiated link pm is prevented by a driver flag we still must ensure that periodic endpoints have longer service intervals than link pm exit latency before allowing device initiated link pm. Fix this by continue walking and checking endpoint service interval if xhci_get_timeout_no_hub_lpm() returns anything else than USB3_LPM_DISABLED While at it fix the split line error message Tested-by: Jan Schmidt Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/1570190373-30684-3-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 8424406219241a0317aa9ddb6416bbfb115dd5bd Author: Johan Hovold Date: Thu Sep 19 10:30:39 2019 +0200 USB: legousbtower: fix open after failed reset request commit 0b074f6986751361ff442bc1127c1648567aa8d6 upstream. The driver would return with a nonzero open count in case the reset control request failed. This would prevent any further attempts to open the char dev until the device was disconnected. Fix this by incrementing the open count only on successful open. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20190919083039.30898-5-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 85034845c7b94ee3a04c54614eb19e04bf1d1000 Author: Johan Hovold Date: Thu Sep 19 10:30:38 2019 +0200 USB: legousbtower: fix potential NULL-deref on disconnect commit cd81e6fa8e033e7bcd59415b4a65672b4780030b upstream. The driver is using its struct usb_device pointer as an inverted disconnected flag, but was setting it to NULL before making sure all completion handlers had run. This could lead to a NULL-pointer dereference in a number of dev_dbg and dev_err statements in the completion handlers which relies on said pointer. Fix this by unconditionally stopping all I/O and preventing resubmissions by poisoning the interrupt URBs at disconnect and using a dedicated disconnected flag. This also makes sure that all I/O has completed by the time the disconnect callback returns. Fixes: 9d974b2a06e3 ("USB: legousbtower.c: remove err() usage") Fixes: fef526cae700 ("USB: legousbtower: remove custom debug macro") Fixes: 4dae99638097 ("USB: legotower: remove custom debug macro and module parameter") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20190919083039.30898-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit c691a736a217ebba198a2904ec32fafbd1b8fa5f Author: Johan Hovold Date: Thu Sep 19 10:30:37 2019 +0200 USB: legousbtower: fix deadlock on disconnect commit 33a7813219f208f4952ece60ee255fd983272dec upstream. Fix a potential deadlock if disconnect races with open. Since commit d4ead16f50f9 ("USB: prevent char device open/deregister race") core holds an rw-semaphore while open is called and when releasing the minor number during deregistration. This can lead to an ABBA deadlock if a driver takes a lock in open which it also holds during deregistration. This effectively reverts commit 78663ecc344b ("USB: disconnect open race in legousbtower") which needlessly introduced this issue after a generic fix for this race had been added to core by commit d4ead16f50f9 ("USB: prevent char device open/deregister race"). Fixes: 78663ecc344b ("USB: disconnect open race in legousbtower") Reported-by: syzbot+f9549f5ee8a5416f0b95@syzkaller.appspotmail.com Tested-by: syzbot+f9549f5ee8a5416f0b95@syzkaller.appspotmail.com Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20190919083039.30898-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 6619f1ebea52b1b0aa19761c77f8b49e63c2b6c2 Author: Johan Hovold Date: Thu Sep 19 10:30:36 2019 +0200 USB: legousbtower: fix slab info leak at probe commit 1d427be4a39defadda6dd8f4659bc17f7591740f upstream. Make sure to check for short transfers when retrieving the version information at probe to avoid leaking uninitialised slab data when logging it. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20190919083039.30898-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 09c97407c67b336bef3306c24fdf8738f602cb45 Author: Will Deacon Date: Fri Oct 4 10:51:31 2019 +0100 mac80211: Reject malformed SSID elements commit 4152561f5da3fca92af7179dd538ea89e248f9d0 upstream. Although this shouldn't occur in practice, it's a good idea to bounds check the length field of the SSID element prior to using it for things like allocations or memcpy operations. Cc: Kees Cook Reported-by: Nicolas Waisman Signed-off-by: Will Deacon Link: https://lore.kernel.org/r/20191004095132.15777-1-will@kernel.org Signed-off-by: Johannes Berg Signed-off-by: Ben Hutchings commit 6c6e0197249d41a64fbcef92d34ef8b0c4d590c9 Author: Johan Hovold Date: Thu Sep 26 11:12:25 2019 +0200 USB: usblcd: fix I/O after disconnect commit eb7f5a490c5edfe8126f64bc58b9ba2edef0a425 upstream. Make sure to stop all I/O on disconnect by adding a disconnected flag which is used to prevent new I/O from being started and by stopping all ongoing I/O before returning. This also fixes a potential use-after-free on driver unbind in case the driver data is freed before the completion handler has run. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20190926091228.24634-7-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit be133edf1148b09defec433deda018d69ef96036 Author: Jacky.Cao@sony.com Date: Thu Sep 5 04:11:57 2019 +0000 USB: dummy-hcd: fix power budget for SuperSpeed mode commit 2636d49b64671d3d90ecc4daf971b58df3956519 upstream. The power budget for SuperSpeed mode should be 900 mA according to USB specification, so set the power budget to 900mA for dummy_start_ss which is only used for SuperSpeed mode. If the max power consumption of SuperSpeed device is larger than 500 mA, insufficient available bus power error happens in usb_choose_configuration function when the device connects to dummy hcd. Signed-off-by: Jacky Cao Acked-by: Alan Stern Link: https://lore.kernel.org/r/16EA1F625E922C43B00B9D82250220500871CDE5@APYOKXMS108.ap.sony.com Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings commit d117411de20b9de7d8e07e426a579cadb802d559 Author: Alan Stern Date: Tue Sep 17 12:47:23 2019 -0400 USB: yurex: Don't retry on unexpected errors commit 32a0721c6620b77504916dac0cea8ad497c4878a upstream. According to Greg KH, it has been generally agreed that when a USB driver encounters an unknown error (or one it can't handle directly), it should just give up instead of going into a potentially infinite retry loop. The three codes -EPROTO, -EILSEQ, and -ETIME fall into this category. They can be caused by bus errors such as packet loss or corruption, attempting to communicate with a disconnected device, or by malicious firmware. Nowadays the extent of packet loss or corruption is negligible, so it should be safe for a driver to give up whenever one of these errors occurs. Although the yurex driver handles -EILSEQ errors in this way, it doesn't do the same for -EPROTO (as discovered by the syzbot fuzzer) or other unrecognized errors. This patch adjusts the driver so that it doesn't log an error message for -EPROTO or -ETIME, and it doesn't retry after any errors. Reported-and-tested-by: syzbot+b24d736f18a1541ad550@syzkaller.appspotmail.com Signed-off-by: Alan Stern CC: Tomoki Sekiyama Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1909171245410.1590-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 01b8eaf4a17012f57b73e87e325acb0d09321680 Author: Johan Hovold Date: Wed Sep 25 11:29:13 2019 +0200 USB: adutux: fix NULL-derefs on disconnect commit b2fa7baee744fde746c17bc1860b9c6f5c2eebb7 upstream. The driver was using its struct usb_device pointer as an inverted disconnected flag, but was setting it to NULL before making sure all completion handlers had run. This could lead to a NULL-pointer dereference in a number of dev_dbg statements in the completion handlers which relies on said pointer. The pointer was also dereferenced unconditionally in a dev_dbg statement release() something which would lead to a NULL-deref whenever a device was disconnected before the final character-device close if debugging was enabled. Fix this by unconditionally stopping all I/O and preventing resubmissions by poisoning the interrupt URBs at disconnect and using a dedicated disconnected flag. This also makes sure that all I/O has completed by the time the disconnect callback returns. Fixes: 1ef37c6047fe ("USB: adutux: remove custom debug macro and module parameter") Fixes: 66d4bc30d128 ("USB: adutux: remove custom debug macro") Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20190925092913.8608-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 4511ec1224b09e5a8ff1045681378a00f91a3747 Author: Johan Hovold Date: Thu Oct 3 09:09:31 2019 +0200 USB: microtek: fix info-leak at probe commit 177238c3d47d54b2ed8f0da7a4290db492f4a057 upstream. Add missing bulk-in endpoint sanity check to prevent uninitialised stack data from being reported to the system log and used as endpoint addresses. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+5630ca7c3b2be5c9da5e@syzkaller.appspotmail.com Signed-off-by: Johan Hovold Acked-by: Oliver Neukum Link: https://lore.kernel.org/r/20191003070931.17009-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings commit 3ef1a68d7b222d3d6dfc49ffea3190c5122d11e2 Author: Johan Hovold Date: Thu Oct 3 15:49:58 2019 +0200 USB: serial: keyspan: fix NULL-derefs on open() and write() commit 7d7e21fafdbc7fcf0854b877bd0975b487ed2717 upstream. Fix NULL-pointer dereferences on open() and write() which can be triggered by a malicious USB device. The current URB allocation helper would fail to initialise the newly allocated URB if the device has unexpected endpoint descriptors, something which could lead NULL-pointer dereferences in a number of open() and write() paths when accessing the URB. For example: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:usb_clear_halt+0x11/0xc0 ... Call Trace: ? tty_port_open+0x4d/0xd0 keyspan_open+0x70/0x160 [keyspan] serial_port_activate+0x5b/0x80 [usbserial] tty_port_open+0x7b/0xd0 ? check_tty_count+0x43/0xa0 tty_open+0xf1/0x490 BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:keyspan_write+0x14e/0x1f3 [keyspan] ... Call Trace: serial_write+0x43/0xa0 [usbserial] n_tty_write+0x1af/0x4f0 ? do_wait_intr_irq+0x80/0x80 ? process_echoes+0x60/0x60 tty_write+0x13f/0x2f0 BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:keyspan_usa26_send_setup+0x298/0x305 [keyspan] ... Call Trace: keyspan_open+0x10f/0x160 [keyspan] serial_port_activate+0x5b/0x80 [usbserial] tty_port_open+0x7b/0xd0 ? check_tty_count+0x43/0xa0 tty_open+0xf1/0x490 Fixes: fdcba53e2d58 ("fix for bugzilla #7544 (keyspan USB-to-serial converter)") Reviewed-by: Greg Kroah-Hartman Signed-off-by: Johan Hovold Signed-off-by: Ben Hutchings commit d4e732b99857edefa91b39720b472dbaed8b1f82 Author: Bastien Nocera Date: Mon Sep 23 18:18:43 2019 +0200 USB: rio500: Remove Rio 500 kernel driver commit 015664d15270a112c2371d812f03f7c579b35a73 upstream. The Rio500 kernel driver has not been used by Rio500 owners since 2001 not long after the rio500 project added support for a user-space USB stack through the very first versions of usbdevfs and then libusb. Support for the kernel driver was removed from the upstream utilities in 2008: https://gitlab.freedesktop.org/hadess/rio500/commit/943f624ab721eb8281c287650fcc9e2026f6f5db Cc: Cesar Miquel Signed-off-by: Bastien Nocera Link: https://lore.kernel.org/r/6251c17584d220472ce882a3d9c199c401a51a71.camel@hadess.net Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: - Also delete CONFIG_USB_RIO500 from arch/powerpc/configs/c2k_defconfig - Drop change to arch/arm/configs/pxa_defconfig - Adjust filename, content] Signed-off-by: Ben Hutchings commit 6d19b7228d0cc2ee5c6e22b8251d9f602cbbdd52 Author: Steffen Maier Date: Tue Oct 1 12:49:49 2019 +0200 scsi: zfcp: fix reaction on bit error threshold notification commit 2190168aaea42c31bff7b9a967e7b045f07df095 upstream. On excessive bit errors for the FCP channel ingress fibre path, the channel notifies us. Previously, we only emitted a kernel message and a trace record. Since performance can become suboptimal with I/O timeouts due to bit errors, we now stop using an FCP device by default on channel notification so multipath on top can timely failover to other paths. A new module parameter zfcp.ber_stop can be used to get zfcp old behavior. User explanation of new kernel message: * Description: * The FCP channel reported that its bit error threshold has been exceeded. * These errors might result from a problem with the physical components * of the local fibre link into the FCP channel. * The problem might be damage or malfunction of the cable or * cable connection between the FCP channel and * the adjacent fabric switch port or the point-to-point peer. * Find details about the errors in the HBA trace for the FCP device. * The zfcp device driver closed down the FCP device * to limit the performance impact from possible I/O command timeouts. * User action: * Check for problems on the local fibre link, ensure that fibre optics are * clean and functional, and all cables are properly plugged. * After the repair action, you can manually recover the FCP device by * writing "0" into its "failed" sysfs attribute. * If recovery through sysfs is not possible, set the CHPID of the device * offline and back online on the service element. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/20191001104949.42810-1-maier@linux.ibm.com Reviewed-by: Jens Remus Reviewed-by: Benjamin Block Signed-off-by: Steffen Maier Signed-off-by: Martin K. Petersen [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 60e76f748865728268357b7aa6a816f64108a693 Author: Eric Dumazet Date: Wed Oct 2 09:38:55 2019 -0700 ipv6: drop incoming packets having a v4mapped source address commit 6af1799aaf3f1bc8defedddfa00df3192445bbf3 upstream. This began with a syzbot report. syzkaller was injecting IPv6 TCP SYN packets having a v4mapped source address. After an unsuccessful 4-tuple lookup, TCP creates a request socket (SYN_RECV) and calls reqsk_queue_hash_req() reqsk_queue_hash_req() calls sk_ehashfn(sk) At this point we have AF_INET6 sockets, and the heuristic used by sk_ehashfn() to either hash the IPv4 or IPv6 addresses is to use ipv6_addr_v4mapped(&sk->sk_v6_daddr) For the particular spoofed packet, we end up hashing V4 addresses which were not initialized by the TCP IPv6 stack, so KMSAN fired a warning. I first fixed sk_ehashfn() to test both source and destination addresses, but then faced various problems, including user-space programs like packetdrill that had similar assumptions. Instead of trying to fix the whole ecosystem, it is better to admit that we have a dual stack behavior, and that we can not build linux kernels without V4 stack anyway. The dual stack API automatically forces the traffic to be IPv4 if v4mapped addresses are used at bind() or connect(), so it makes no sense to allow IPv6 traffic to use the same v4mapped class. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Cc: Florian Westphal Cc: Hannes Frederic Sowa Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit 4b911101a5cd44fc43c1f4f2b3f162ca58c4ac42 Author: Tomi Valkeinen Date: Wed Oct 2 15:25:42 2019 +0300 drm/omap: fix max fclk divider for omap36xx commit e2c4ed148cf3ec8669a1d90dc66966028e5fad70 upstream. The OMAP36xx and AM/DM37x TRMs say that the maximum divider for DSS fclk (in CM_CLKSEL_DSS) is 32. Experimentation shows that this is not correct, and using divider of 32 breaks DSS with a flood or underflows and sync losts. Dividers up to 31 seem to work fine. There is another patch to the DT files to limit the divider correctly, but as the DSS driver also needs to know the maximum divider to be able to iteratively find good rates, we also need to do the fix in the DSS driver. Signed-off-by: Tomi Valkeinen Cc: Adam Ford Link: https://patchwork.freedesktop.org/patch/msgid/20191002122542.8449-1-tomi.valkeinen@ti.com Tested-by: Adam Ford Reviewed-by: Jyri Sarha [bwh: Backported to 3.16: adjust filename, context] Signed-off-by: Ben Hutchings commit 099a8d6ded426b7e3baaf5cc67aa2c2a8dce2ca8 Author: Beni Mahler Date: Thu Sep 5 00:26:20 2019 +0200 USB: serial: ftdi_sio: add device IDs for Sienna and Echelon PL-20 commit 357f16d9e0194cdbc36531ff88b453481560b76a upstream. Both devices added here have a FTDI chip inside. The device from Echelon is called 'Network Interface' it is actually a LON network gateway. ID 0403:8348 Future Technology Devices International, Ltd https://www.eltako.com/fileadmin/downloads/de/datenblatt/Datenblatt_PL-SW-PROF.pdf ID 0920:7500 Network Interface https://www.echelon.com/products/u20-usb-network-interface Signed-off-by: Beni Mahler Signed-off-by: Johan Hovold Signed-off-by: Ben Hutchings commit d16d5753bddbf191b17bd9c77f530ed8dd01fba9 Author: Johan Hovold Date: Mon Sep 30 17:12:41 2019 +0200 hso: fix NULL-deref on tty open commit 8353da9fa69722b54cba82b2ec740afd3d438748 upstream. Fix NULL-pointer dereference on tty open due to a failure to handle a missing interrupt-in endpoint when probing modem ports: BUG: kernel NULL pointer dereference, address: 0000000000000006 ... RIP: 0010:tiocmget_submit_urb+0x1c/0xe0 [hso] ... Call Trace: hso_start_serial_device+0xdc/0x140 [hso] hso_serial_open+0x118/0x1b0 [hso] tty_open+0xf1/0x490 Fixes: 542f54823614 ("tty: Modem functions for the HSO driver") Signed-off-by: Johan Hovold Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings commit ba93d2ce27b0f5245ac8e278e46bfd0a0fefb293 Author: Jose Abreu Date: Mon Sep 30 10:19:09 2019 +0200 net: stmmac: Correctly take timestamp for PTPv2 commit 14f347334bf232074616e29e29103dd0c7c54dec upstream. The case for PTPV2_EVENT requires event packets to be captured so add this setting to the list of enabled captures. Fixes: 891434b18ec0 ("stmmac: add IEEE PTPv1 and PTPv2 support.") Signed-off-by: Jose Abreu Signed-off-by: David S. Miller [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit e67eb2b7ed4c86eee4699914198eb03b6ea3a63d Author: Bart Van Assche Date: Mon Sep 30 16:16:54 2019 -0700 RDMA/iwcm: Fix a lock inversion issue commit b66f31efbdad95ec274345721d99d1d835e6de01 upstream. This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ #1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 #1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0 #2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd92137 ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Ben Hutchings commit 1720c6f9aefcab68a4e7c058db44127a646ca9bf Author: Michał Mirosław Date: Fri Aug 23 21:15:27 2019 +0200 HID: fix error message in hid_open_report() commit b3a81c777dcb093020680490ab970d85e2f6f04f upstream. On HID report descriptor parsing error the code displays bogus pointer instead of error offset (subtracts start=NULL from end). Make the message more useful by displaying correct error offset and include total buffer size for reference. This was carried over from ancient times - "Fixed" commit just promoted the message from DEBUG to ERROR. Fixes: 8c3d52fc393b ("HID: make parser more verbose about parsing errors by default") Signed-off-by: Michał Mirosław Signed-off-by: Jiri Kosina Signed-off-by: Ben Hutchings commit 584df56f99dd94362a24c4ed710239b8965d8791 Author: Denis Efremov Date: Thu Sep 26 10:31:38 2019 +0300 staging: rtl8188eu: fix HighestRate check in odm_ARFBRefresh_8188E() commit 22d67a01d8d89552b989c9651419824bb4111200 upstream. It's incorrect to compare HighestRate with 0x0b twice in the following manner "if (HighestRate > 0x0b) ... else if (HighestRate > 0x0b) ...". The "else if" branch is constantly false. The second comparision should be with 0x03 according to the max_rate_idx in ODM_RAInfo_Init(). Cc: Michael Straube Signed-off-by: Denis Efremov Acked-by: Larry Finger Link: https://lore.kernel.org/r/20190926073138.12109-1-efremov@linux.com Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.16: adjust filename, indentation] Signed-off-by: Ben Hutchings commit bc0104b28a60413fc7d2d83c5a32522e1550a8d4 Author: Oliver Neukum Date: Tue Sep 3 12:18:39 2019 +0200 scsi: sd: Ignore a failure to sync cache due to lack of authorization commit 21e3d6c81179bbdfa279efc8de456c34b814cfd2 upstream. I've got a report about a UAS drive enclosure reporting back Sense: Logical unit access not authorized if the drive it holds is password protected. While the drive is obviously unusable in that state as a mass storage device, it still exists as a sd device and when the system is asked to perform a suspend of the drive, it will be sent a SYNCHRONIZE CACHE. If that fails due to password protection, the error must be ignored. Link: https://lore.kernel.org/r/20190903101840.16483-1-oneukum@suse.com Signed-off-by: Oliver Neukum Signed-off-by: Martin K. Petersen [bwh: Backported to 3.16: sshdr is a struct not a pointer here] Signed-off-by: Ben Hutchings commit 47715490ac6d7c5087e8ef2288ef68899cf9e027 Author: Eric Dumazet Date: Thu Sep 26 18:24:43 2019 -0700 sch_cbq: validate TCA_CBQ_WRROPT to avoid crash commit e9789c7cc182484fc031fd88097eb14cb26c4596 upstream. syzbot reported a crash in cbq_normalize_quanta() caused by an out of range cl->priority. iproute2 enforces this check, but malicious users do not. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI Modules linked in: CPU: 1 PID: 26447 Comm: syz-executor.1 Not tainted 5.3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:cbq_normalize_quanta.part.0+0x1fd/0x430 net/sched/sch_cbq.c:902 RSP: 0018:ffff8801a5c333b0 EFLAGS: 00010206 RAX: 0000000020000003 RBX: 00000000fffffff8 RCX: ffffc9000712f000 RDX: 00000000000043bf RSI: ffffffff83be8962 RDI: 0000000100000018 RBP: ffff8801a5c33420 R08: 000000000000003a R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000002ef R13: ffff88018da95188 R14: dffffc0000000000 R15: 0000000000000015 FS: 00007f37d26b1700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004c7cec CR3: 00000001bcd0a006 CR4: 00000000001626f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [] cbq_normalize_quanta include/net/pkt_sched.h:27 [inline] [] cbq_addprio net/sched/sch_cbq.c:1097 [inline] [] cbq_set_wrr+0x2d7/0x450 net/sched/sch_cbq.c:1115 [] cbq_change_class+0x987/0x225b net/sched/sch_cbq.c:1537 [] tc_ctl_tclass+0x555/0xcd0 net/sched/sch_api.c:2329 [] rtnetlink_rcv_msg+0x485/0xc10 net/core/rtnetlink.c:5248 [] netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2510 [] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5266 [] netlink_unicast_kernel net/netlink/af_netlink.c:1324 [inline] [] netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1350 [] netlink_sendmsg+0x89a/0xd50 net/netlink/af_netlink.c:1939 [] sock_sendmsg_nosec net/socket.c:673 [inline] [] sock_sendmsg+0x12e/0x170 net/socket.c:684 [] ___sys_sendmsg+0x81d/0x960 net/socket.c:2359 [] __sys_sendmsg+0x105/0x1d0 net/socket.c:2397 [] SYSC_sendmsg net/socket.c:2406 [inline] [] SyS_sendmsg+0x29/0x30 net/socket.c:2404 [] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305 [] entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller [bwh: Backported to 3.16: - netlink doesn't support error messages - Keep calling nla_parse_nested() - Adjust context] Signed-off-by: Ben Hutchings commit cb6a7efdd78d5ad3c3b0e4b567b9d6402a109fda Author: Balasubramani Vivekanandan Date: Thu Sep 26 15:51:01 2019 +0200 tick: broadcast-hrtimer: Fix a race in bc_set_next commit b9023b91dd020ad7e093baa5122b6968c48cc9e0 upstream. When a cpu requests broadcasting, before starting the tick broadcast hrtimer, bc_set_next() checks if the timer callback (bc_handler) is active using hrtimer_try_to_cancel(). But hrtimer_try_to_cancel() does not provide the required synchronization when the callback is active on other core. The callback could have already executed tick_handle_oneshot_broadcast() and could have also returned. But still there is a small time window where the hrtimer_try_to_cancel() returns -1. In that case bc_set_next() returns without doing anything, but the next_event of the tick broadcast clock device is already set to a timeout value. In the race condition diagram below, CPU #1 is running the timer callback and CPU #2 is entering idle state and so calls bc_set_next(). In the worst case, the next_event will contain an expiry time, but the hrtimer will not be started which happens when the racing callback returns HRTIMER_NORESTART. The hrtimer might never recover if all further requests from the CPUs to subscribe to tick broadcast have timeout greater than the next_event of tick broadcast clock device. This leads to cascading of failures and finally noticed as rcu stall warnings Here is a depiction of the race condition CPU #1 (Running timer callback) CPU #2 (Enter idle and subscribe to tick broadcast) --------------------- --------------------- __run_hrtimer() tick_broadcast_enter() bc_handler() __tick_broadcast_oneshot_control() tick_handle_oneshot_broadcast() raw_spin_lock(&tick_broadcast_lock); dev->next_event = KTIME_MAX; //wait for tick_broadcast_lock //next_event for tick broadcast clock set to KTIME_MAX since no other cores subscribed to tick broadcasting raw_spin_unlock(&tick_broadcast_lock); if (dev->next_event == KTIME_MAX) return HRTIMER_NORESTART // callback function exits without restarting the hrtimer //tick_broadcast_lock acquired raw_spin_lock(&tick_broadcast_lock); tick_broadcast_set_event() clockevents_program_event() dev->next_event = expires; bc_set_next() hrtimer_try_to_cancel() //returns -1 since the timer callback is active. Exits without restarting the timer cpu_base->running = NULL; The comment that hrtimer cannot be armed from within the callback is wrong. It is fine to start the hrtimer from within the callback. Also it is safe to start the hrtimer from the enter/exit idle code while the broadcast handler is active. The enter/exit idle code and the broadcast handler are synchronized using tick_broadcast_lock. So there is no need for the existing try to cancel logic. All this can be removed which will eliminate the race condition as well. Fixes: 5d1638acb9f6 ("tick: Introduce hrtimer based broadcast") Originally-by: Thomas Gleixner Signed-off-by: Balasubramani Vivekanandan Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/20190926135101.12102-2-balasubramani_vivekanandan@mentor.com [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings commit 8c5db0803277401d2528c1dc76de78f775a0ad91 Author: Andreas Sandberg Date: Fri Apr 24 13:06:05 2015 +0000 tick: hrtimer-broadcast: Prevent endless restarting when broadcast device is unused commit 38d23a6cc16c02f7b0c920266053f340b5601735 upstream. The hrtimer callback in the hrtimer's tick broadcast code sometimes incorrectly ends up scheduling events at the current tick causing the kernel to hang servicing the same hrtimer forever. This typically happens when a device is swapped out by tick_install_broadcast_device(), which replaces the event handler with clock_events_handle_noop() and sets the device mode to CLOCK_EVT_MODE_UNUSED. If the timer is scheduled when this happens, the next_event field will not be updated and the hrtimer ends up being restarted at the current tick. To prevent this from happening, only try to restart the hrtimer if the broadcast clock event device is in one of the active modes and try to cancel the timer when entering the CLOCK_EVT_MODE_UNUSED mode. Signed-off-by: Andreas Sandberg Tested-by: Catalin Marinas Acked-by: Mark Rutland Reviewed-by: Preeti U Murthy Link: http://lkml.kernel.org/r/1429880765-5558-1-git-send-email-andreas.sandberg@arm.com Signed-off-by: Thomas Gleixner [bwh: Backported to 3.16 as dependency of commit b9023b91dd02 "tick: broadcast-hrtimer: Fix a race in bc_set_next"] Signed-off-by: Ben Hutchings commit ace2783ac2760821c64a05c5de4e10b224097ca4 Author: Thomas Gleixner Date: Tue Apr 14 21:09:22 2015 +0000 tick: broadcast-hrtimer: Remove overly clever return value abuse commit b8a62f1ff0ccb18fdc25c6150d1cd394610f4753 upstream. The assignment of bc_moved in the conditional construct relies on the fact that in the case of hrtimer_start() invocation the return value is always 0. It took me a while to understand it. We want to get rid of the hrtimer_start() return value. Open code the logic which makes it readable as well. Signed-off-by: Thomas Gleixner Reviewed-by: Preeti U Murthy Acked-by: Peter Zijlstra Cc: Viresh Kumar Cc: Marcelo Tosatti Cc: Frederic Weisbecker Link: http://lkml.kernel.org/r/20150414203503.404751457@linutronix.de Signed-off-by: Thomas Gleixner [bwh: Backported to 3.16 to ease backporting commit b9023b91dd02 "tick: broadcast-hrtimer: Fix a race in bc_set_next"] Signed-off-by: Ben Hutchings commit 82a084a03d4b842b704de4d4ca9dfcdf3cc45f1c Author: Viresh Kumar Date: Sun Jun 22 01:29:15 2014 +0200 hrtimer: Store cpu-number in struct hrtimer_cpu_base commit cddd02489f52ccf635ed65931214729a23b93cd6 upstream. In lowres mode, hrtimers are serviced by the tick instead of a clock event. Now it works well as long as the tick stays periodic but we must also make sure that the hrtimers are serviced in dynticks mode. Part of that job consist in kicking a dynticks hrtimer target in order to make it reconsider the next tick to schedule to correctly handle the hrtimer's expiring time. And that part isn't handled by the hrtimers subsystem. To prepare for fixing this, we need __hrtimer_start_range_ns() to be able to resolve the CPU target associated to a hrtimer's object 'cpu_base' so that the kick can be centralized there. So lets store it in the 'struct hrtimer_cpu_base' to resolve the CPU without overhead. It is set once at CPU's online notification. Signed-off-by: Viresh Kumar Signed-off-by: Frederic Weisbecker Link: http://lkml.kernel.org/r/1403393357-2070-4-git-send-email-fweisbec@gmail.com Signed-off-by: Thomas Gleixner [bwh: Backported to 3.16 as dependency of commit b9023b91dd02 "tick: broadcast-hrtimer: Fix a race in bc_set_next": - Adjust filename, context] Signed-off-by: Ben Hutchings