| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function (or of similar nature) is required to safely use a refcnt
and smr_entry together. Such functions exist on other platforms as
kref_get_unless_zero (on Linux) and refcount_acquire_if_gt (on FreeBSD).
The following diagram details the following situation with and without
refcnt_take_if_gt in 3 cases, with the first showing the "invalid" use
of refcnt_take.
Situation:
Thread #1 is removing the global referenc (o).
Thread #2 wants to reference an object (r), using a thread pointer (t).
Case:
1) refcnt_take after Thread #1 has released "o"
2) refcnt_take_if_gt before Thread #1 has released "o"
3) refcnt_take_if_gt after Thread #1 has released "o"
Data:
struct obj {
struct smr_entry smr;
struct refcnt refcnt;
} *o, *r, *t1, *t2;
Thread #1 | Thread #2
---------------------------------+------------------------------------
| r = NULL;
rw_enter_write(&lock); | smr_read_enter();
|
t1 = SMR_PTR_GET_LOCKED(&o); | t2 = SMR_PTR_GET(&o);
SMR_PTR_SET_LOCKED(&o, NULL); |
|
if (refcnt_rele(&t1->refcnt) |
smr_call(&t1->smr, free, t1); |
| if (t2 != NULL) {
| refcnt_take(&t2->refcnt);
| r = t2;
| }
rw_exit_write(&lock); | smr_read_exit();
.....
// called by smr_thread |
free(t1); |
.....
| // use after free
| *r
---------------------------------+------------------------------------
| r = NULL;
rw_enter_write(&lock); | smr_read_enter();
|
t1 = SMR_PTR_GET_LOCKED(&o); | t2 = SMR_PTR_GET(&o);
SMR_PTR_SET_LOCKED(&o, NULL); |
|
if (refcnt_rele(&t1->refcnt) |
smr_call(&t1->smr, free, t1); |
| if (t2 != NULL &&
| refcnt_take_if_gt(&t2->refcnt, 0))
| r = t2;
rw_exit_write(&lock); | smr_read_exit();
.....
// called by smr_thread | // we don't have a valid reference
free(t1); | assert(r == NULL);
---------------------------------+------------------------------------
| r = NULL;
rw_enter_write(&lock); | smr_read_enter();
|
t1 = SMR_PTR_GET_LOCKED(&o); | t2 = SMR_PTR_GET(&o);
SMR_PTR_SET_LOCKED(&o, NULL); |
| if (t2 != NULL &&
| refcnt_take_if_gt(&t2->refcnt, 0))
| r = t2;
if (refcnt_rele(&t1->refcnt) |
smr_call(&t1->smr, free, t1); |
rw_exit_write(&lock); | smr_read_exit();
.....
| // we need to put our reference
| if (refcnt_rele(&t2->refcnt))
| smr_call(&t2->smr, free, t2);
.....
// called by smr_thread |
free(t1); |
---------------------------------+------------------------------------
Currently it uses atomic_add_int_nv to atomically read the refcnt,
but I'm open to suggestions for better ways.
The atomic_cas_uint is used to ensure that refcnt hasn't been modified
since reading `old`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
case when requested table is already exists.
Except initialization time, route_output() and if_createrdomain() are the
only paths where we call rtable_add(9). We check requested table existence
by rtable_exists(9) and it's not the error condition if the table exists.
Otherwise we are trying to create requested table by rtable_add(9). Those
paths are kernel locked so concurrent thread can't create requested table
just after rtable_exists(9) check. Also rtable_add(9) has internal
rtable_exists(9) check and in this case the table existence assumed as
EEXIST error. This error path is never reached.
We are going to unlock PF_ROUTE sockets. This means route_output() will
not be serialized with if_createrdomain() and concurrent thread could
create requested table. Table existence check and creation should be
serialized and it makes sense to do this within rtable_add(9). This time
kernel lock is used for this so it pushed down to rtable_add(9). The
internal rtable_exists(9) check was modified and table existence is not
error now.
Since the external rtable_exists(9) check is useless it was removed from
if_createrdomain(). It still exists in route_output() path because the
logic is more complicated here.
ok mpi@
|
| |
|
|
|
|
|
|
| |
and installing USD/SMM/PSD docs.
jmc@ agrees with the direction, ok millert@ on an earlier diff
|
|
|
|
|
|
|
| |
i'm not a fan of having to cast to caddr_t when we have modern
inventions like void *s we can take advantage of.
ok claudio@ mvs@ bluhm@
|
|
|
|
| |
ok millert@
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
We don't use static in the kernel due to ddb so functions private
to the compilation unit are basically equivalent.
OK cheloha@ gnezdo@ mglocker@
|
|
|
|
| |
ok kn@ mvs@
|
|
|
|
|
|
|
|
|
|
|
| |
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allow us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.
ok claudio@ sashan@
|
|
|
|
| |
ok anton@ kn@
|
| |
|
| |
|
|
|
|
|
|
| |
OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@
|
|
|
|
|
|
|
| |
-width ".Dv BOB" -> -width "BOB"
although they are not errors, they are misleading and probably should
not get pasted around
|
|
|
|
|
|
|
|
|
|
| |
Its value is conflicting with an effort to standardize TX flags fields of
the radiotap header from Mathy Vanhoef.
This flag used to indicate the presence of a specific hardware queue used
by WME but is unchecked.
ok sthen@, kn@
|
|
|
|
| |
called from interrupt context.
|
| |
|
|
|
|
|
|
|
| |
Remove obsolete sysctl_int_arr documentation.
Looks good, deraadt@
reads ok, jmc@
|
| |
|
| |
|
|
|
|
| |
simplifications, add missing markup, and break an overlong line
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RFC 4291 dropped this requirement from RFC 3513:
o An anycast address must not be used as the source address of an
IPv6 packet.
And from that requirement draft-itojun-ipv6-tcp-to-anycast rightly
concluded that TCP connections must be prevented.
The draft also states:
The proposed method MUST be removed when one of the following events
happens in the future:
o Restriction imposed on IPv6 anycast address is loosened, so that
anycast address can be placed into source address field of the IPv6
header[...]
OK jca
|
|
|
|
|
|
|
|
|
|
| |
These two interfaces have been entirely unused since introduction.
Remove them and thin the "timeout" namespace a bit.
Discussed with mpi@ and ratchov@ almost a year ago, though I blocked
the change at that time. Also discussed with visa@.
ok visa@, mpi@
|
|
|
|
| |
Spotted by schwarze@, discussed with kn@ and schwarze@
|
|
|
|
| |
the SYNOPSIS with the modern .Fo idiom; no output change
|
|
|
|
|
| |
it reads better to me. it might be worth considering this for
queue(3) too.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
removal of the vnode interlock in 2007.
Reported by and original diff from Dominik Schreilechner.
OK jmc@
|
|
|
|
| |
OK kn@, "fine" deraadt@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
from threads other than the one currently having kcov enabled. A thread
with kcov enabled occasionally delegates work to another thread,
collecting coverage from such threads improves the ability of syzkaller
to correlate side effects in the kernel caused by issuing a syscall.
Remote coverage is divided into subsystems. The only supported subsystem
right now collects coverage from scheduled tasks and timeouts on behalf
of a kcov enabled thread. In order to make this work `struct task' and
`struct timeout' must be extended with a new field keeping track of the
process that scheduled the task/timeout. Both aforementioned structures
have therefore increased with the size of a pointer on all
architectures.
The kernel API is documented in a new kcov_remote_register(9) manual.
Remote coverage is also supported by kcov on NetBSD and Linux.
ok mpi@
|
| |
|
|
|
|
| |
Remove it from the Makefile too, unbreaking the tree.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
time_second(9) has been replaced in the kernel by gettime(9).
time_uptime(9) has been replaced in the kernel by getuptime(9).
New code should use the replacement interfaces. They do not suffer
from the split-read problem inherent to the time_* variables on 32-bit
platforms.
The variables remain in sys/kern/kern_tc.c for use via kvm(3) when
examining kernel core dumps.
This commit completes the deprecation process:
- Remove the extern'd definitions for time_second and time_uptime
from sys/time.h.
- Replace manpage cross-references to time_second(9)/time_uptime(9)
with references to microtime(9) or a related interface.
- Move the time_second.9 manpage to the attic.
With input from dlg@, kettenis@, visa@, and tedu@.
ok kettenis@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
time_second and time_uptime are used widely in the tree. This is a
problem on 32-bit platforms because time_t is 64-bit, so there is a
potential split-read whenever they are used at or below IPL_CLOCK.
Here are two replacement interfaces: gettime(9) and getuptime(9).
The "get" prefix signifies that they do not read the hardware
timecounter, i.e. they are fast and low-res. The lack of a unit
(e.g. micro, nano) signifies that they yield a plain time_t.
As an optimization on LP64 platforms we can just return time_second or
time_uptime, as a single read is atomic. On 32-bit platforms we need
to do the lockless read loop and get the values from the timecounter.
In a subsequent diff these will be substituted for time_second and
time_uptime almost everywhere in the kernel.
With input from visa@ and dlg@.
ok kettenis@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gif used its mbuf tag to store it's interface index so it could
detect loops. gre also did this, and i cut most of the drivers
(including gif) over to using the gre tag. so the gif tag is unused.
wireguard uses the tag to store peer information between different
contexts the packet is processed in. it also needs a bit more space
to do that.
from Matt Dunwoodie and Jason A. Donenfeld
ok deraadt@
|
| |
|
| |
|
|
|
|
| |
OK deraadt@, jmc@, mpi@
|
|
|
|
|
| |
it's only available on amd64 (and i386), so don't really want to
encourage it's use just yet.
|
| |
|
|
|
|
|
| |
i feel like ive used the word vector too much, but hopefully someone
who is good with english will check this and fix it.
|
| |
|
| |
|