Minor fixes after the merge
Deleting some duplicate functions and header
62
README.md
@@ -1,6 +1,6 @@
# filehasher

## Presentation
# Presentation
Collects some metadata and hashes files. It outputs the path, hash, size, creation and
last modification dates and the author in file_hasher.txt.
Creation and modification dates and author can be disabled in the config file.
@@ -19,9 +19,9 @@ It is a high performance cross platform Windows and Linux compatible program, it
It can be disabled in the config file.
* Fallback to buffered I/O if there are errors in the IO Ring path.

## Building
### Windows
#### Release
# Building
## Windows
### Release

**Note**: Make sure to use the UCRT64 environment from MSYS2 instead of the standard MinGW environment.
UCRT64 uses the modern Universal C Runtime (ucrtbase.dll), which supports the newest APIs,
@@ -37,24 +37,24 @@ gcc -O3 file_hasher.c xxhash.c xxh_x86dispatch.c -o file_hasher
clang -O3 file_hasher.c xxhash.c xxh_x86dispatch.c -o file_hasher
clang-cl /O2 file_hasher.c xxhash.c xxh_x86dispatch.c

#### Debug
### Debug
gcc -g -O0 file_hasher.c xxhash.c xxh_x86dispatch.c -o file_hasher
clang -g -O0 file_hasher.c xxhash.c xxh_x86dispatch.c -o file_hasher
clang-cl /Zi /Od file_hasher.c xxhash.c xxh_x86dispatch.c

### Linux
#### Release
## Linux
### Release
gcc -O3 file_hasher.c xxhash.c xxh_x86dispatch.c -pthread -luring -o file_hasher
clang -O3 file_hasher.c xxhash.c xxh_x86dispatch.c -pthread -luring -o file_hasher

#### Debug
### Debug
gcc -g -O0 file_hasher.c xxhash.c xxh_x86dispatch.c -pthread -luring -o file_hasher
clang -g -O0 file_hasher.c xxhash.c xxh_x86dispatch.c -pthread -luring -o file_hasher

## Notes about the IO Ring implementations
### IO Ring
# Notes about the IO Ring implementations
## IO Ring

#### File registration
### File registration
Registering files is a performance optimization that allows the kernel to allocate an array
of descriptors/handles to pre-validate and maintain long-term references to file handles.
Instead of passing a standard file descriptor/handle with every I/O request, you pass a simple integer
@@ -66,7 +66,7 @@ use io_uring_register_files_update() to update one or more entries. Windows on t
is limited to BuildIoRingRegisterFileHandles() only, so we need to re-register the entire array of handles
each time. This is why config.h provides a macro to enable or disable it.

##### Why Register Files? (The Benefits)
#### *Why Register Files? (The Benefits)*
When you use a standard file descriptor in a high-frequency I/O loop,
the kernel must perform several "hidden" tasks for every single operation:
* Permission Checks: Validating that the process still has the right to read/write
@@ -80,16 +80,17 @@ Registering the files performs these checks once at registration time. Subsequen
I/O operations skip these steps, significantly reducing CPU overhead and latency,
especially when handling thousands of small I/O operations per second.

##### Comparison: Linux vs. Windows Implementation
#### *Comparison: Linux vs. Windows Implementation*
While both systems share the same core concept, their APIs and management styles differ significantly.
Feature Linux (io_uring) Windows (IoRing)
API Call io_uring_register BuildIoRingRegisterFileHandles
Registration Method Synchronous system call that blocks until the table is set up. Asynchronous request submitted to the ring just like a read/write operation.
Partial Updates Supports IORING_REGISTER_FILES_UPDATE to swap specific indices without a full reset. Does not support partial updates; a new registration call replaces the entire existing table.
Memory Mapping User must manually mmap() the queues into their address space. The kernel handles memory mapping automatically when the ring is created.
Scope of Operations Extremely broad (files, sockets, timers, signals, even other rings). Primarily focused on file storage (read, write, flush).

#### Completion Wait count
| Feature | Linux (`io_uring`) | Windows (`IoRing`) |
| :--- | :--- | :--- |
| **API Call** | `io_uring_register` | `BuildIoRingRegisterFileHandles` |
| **Registration Method** | Synchronous system call; blocks until the table is set up. | Asynchronous request submitted to the ring like a read/write operation. |
| **Partial Updates** | Supports `IORING_REGISTER_FILES_UPDATE` to swap specific indices. | No partial updates; a new registration replaces the entire table. |
| **Scope of Operations** | Extremely broad (files, sockets, timers, signals, etc.). | Primarily focused on file storage (read, write, flush). |

### Completion Wait count
To avoid busy-waiting for CQEs, Linux lets us pass a wait count to io_uring_submit_and_wait():
the thread sleeps until that many CQEs have arrived. On Windows, SubmitIoRing() also takes a wait count,
but it is not implemented yet, so we wait on a completion event for a single completion. Another limitation on the completion
@@ -97,7 +98,7 @@ event is that the kernel will wake up the thread only when receiving the first C
queue completely before sleeping again, or the thread sleeps forever. With my configuration, each time the thread wakes up
it rarely receives more than 3 to 5 CQEs, and most of the time only one.

#### Filtering CQEs
### Filtering CQEs

Unlike Linux, the Windows implementation treats buffer and file registration
as an asynchronous operation that we submit to the ring, similar to a read or write.
@@ -108,9 +109,9 @@ cqe.UserData == USERDATA_REGISTER
continue;
```

### io_uring
## io_uring

#### Creation flags
### Creation flags
io_uring provides many more configuration flags than IO Ring. Some
apply at creation time and others during operation. Here are the ones
this implementation sets at creation time and that are lacking in the
@@ -122,7 +123,7 @@ IO Ring implementation.
is ready, we use this flag to disable this syscall and wait for a specific number of
CQEs to be ready so they can be processed as a group, which reduces the number of syscalls.

#### Memlock limit warning
### Memlock limit warning

```c
"WARNING: Buffer registration failed due to memlock limits (ENOMEM).\n"
@@ -136,7 +137,7 @@ And registering buffers will lock the buffers memory so the hardware
can access it directly without kernel intervention and prevent the kernel from
swapping it out to the SSD or HDD. Increase the limit to be able to register the buffers.
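
Before raising the limit, it can help to see what the current one is. The sketch below is not taken from file_hasher.c: `memlock_allows()` and `print_memlock_limit()` are hypothetical helpers built on the standard `getrlimit(RLIMIT_MEMLOCK)` call. They compare a planned buffer size against the soft limit, which is only a rough pre-check, since other locked memory counts against the same limit.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper, not part of file_hasher.c.
 * Returns 1 if `bytes` fits under the soft RLIMIT_MEMLOCK limit,
 * 0 if it does not, and -1 if the limit cannot be queried. */
int memlock_allows(unsigned long long bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
        return -1;
    if (lim.rlim_cur == RLIM_INFINITY)
        return 1;
    return bytes <= (unsigned long long)lim.rlim_cur ? 1 : 0;
}

/* Prints the soft limit in KB, the same unit `ulimit -l` reports. */
void print_memlock_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0) {
        perror("getrlimit");
        return;
    }
    if (lim.rlim_cur == RLIM_INFINITY)
        printf("memlock soft limit: unlimited\n");
    else
        printf("memlock soft limit: %llu KB\n",
               (unsigned long long)lim.rlim_cur / 1024);
}
```

A caller could run `memlock_allows(buffer_count * buffer_size)` before attempting buffer registration and print the warning early when it returns 0.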

##### Modifying the Limit:
#### *Modifying the Limit*
The method for changing the memlock limit depends on whether you are
managing a user session or a system service.
1. For Users and Interactive Sessions
@@ -147,7 +148,8 @@ the /etc/security/limits.conf file. Add the following lines:
# Example for a specific user (replace 'username'), unlimited or a custom value in KB
username soft memlock unlimited
username hard memlock unlimited

```
```conf
# Example for all users
* soft memlock unlimited
* hard memlock unlimited
@@ -169,7 +171,7 @@ systemd. To increase the limit for a service, edit its service file
LimitMEMLOCK=infinity
```

##### Why Register Buffers?
#### *Why Register Buffers?*
In a standard "unregistered" I/O operation, the kernel must perform several
expensive steps for every single read or write:
* Virtual-to-Physical Mapping: The kernel has to translate your application's
@@ -182,7 +184,7 @@ expensive steps for every single read or write:

Registering the buffers performs all of this "pinning" and "mapping" once.

#### Direct I/O: O_DIRECT (Linux) and FILE_FLAG_NO_BUFFERING (Windows)
### Direct I/O: O_DIRECT (Linux) and FILE_FLAG_NO_BUFFERING (Windows)

Modern operating systems normally use a page cache when reading files. This means file
data is first loaded into kernel memory and then copied to user space. While this improves
@@ -195,7 +197,7 @@ Windows: FILE_FLAG_NO_BUFFERING

These flags instruct the OS to transfer data directly between disk and user-provided buffers, avoiding the page cache.
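
Both flags come with alignment requirements: the buffer address, the file offset, and the transfer size generally have to be multiples of the device's logical block size. A minimal sketch of an allocation helper follows, assuming a 4096-byte alignment (the real value should be queried from the volume) and C11 `aligned_alloc()`. The names `DIRECT_IO_ALIGN`, `align_up()`, and `alloc_direct_buffer()` are hypothetical and do not come from file_hasher.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment; in practice, query the logical block size. */
#define DIRECT_IO_ALIGN 4096u

/* Rounds n up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: allocates a buffer whose address and size are both
 * multiples of align. Stores the rounded size in *actual.
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_direct_buffer(size_t wanted, size_t align, size_t *actual)
{
    size_t size = align_up(wanted, align);
    void *buf = aligned_alloc(align, size); /* C11: size is a multiple of align */

    if (buf == NULL)
        return NULL;
    *actual = size;
    return buf;
}
```

For example, requesting 5000 bytes with a 4096-byte alignment yields an 8192-byte buffer. On Windows toolchains where `aligned_alloc()` is unavailable, `_aligned_malloc()` could play the same role.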

##### Benefits
#### *Benefits*
1. Reduced memory overhead
Avoids polluting the OS page cache
Especially useful for large sequential reads (e.g. hashing, backups)
@@ -210,7 +212,7 @@ Prevents cache contention between threads
5. Avoids double caching
Important when the application already manages its own buffering

##### File system compatibility
#### *File system compatibility*
Not all file systems are compatible with O_DIRECT. If we try to open files residing on an NTFS partition,
most of the time the open fails, and sometimes it succeeds but the CQEs return a bad file descriptor
error, which causes some lag.
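
One way to cope with this is to try O_DIRECT first and reopen with buffered I/O when the file system rejects it. The sketch below assumes Linux with glibc; `open_direct_or_buffered()` is a hypothetical helper, not the function used in file_hasher.c. It only covers failures at open time, so read errors reported later through CQEs would still need their own fallback.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <fcntl.h>

/* Hypothetical helper: opens `path` read-only, preferring direct I/O.
 * Sets *used_direct to 1 when O_DIRECT is active, 0 otherwise.
 * Returns the file descriptor, or -1 if both attempts fail. */
int open_direct_or_buffered(const char *path, int *used_direct)
{
#ifdef O_DIRECT
    int fd = open(path, O_RDONLY | O_DIRECT);

    if (fd >= 0) {
        *used_direct = 1;
        return fd;
    }
#endif
    /* Direct open was rejected or is unavailable: retry through the page cache. */
    *used_direct = 0;
    return open(path, O_RDONLY);
}
```

The caller can use `used_direct` to decide whether aligned buffers and block-sized reads are required for that file.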