Hi all,
I'm really new to Linux, especially to driver development. I'm trying to write a kind of ramdisk driver (similar to sbull (LDD3), brd and RapidDisk). However, it should act as a simulator for a real block device.
The following questions are real problems for me right now.
----------------------------1--------------------
I use a "no queue" mode, i.e., I implement the make_request function directly. The simulated device supports concurrent IOs and has certain latencies for READ/WRITE ops, which should be simulated. The concurrent behaviour follows the device structure: e.g., the device consists of two separate parts that can process IO separately and in parallel, while requests within a given "part" are synchronous/sequential. For the delay simulation I've used the udelay function, because usleep_range is too inaccurate for my purposes (i.e., delays of 25us, 100us ...). Now the question: can the atomic (busy-waiting) udelay be used in such a concurrent context? Here is the code I'm thinking about (simplified):
Code:
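void make_request(struct request_queue *q, struct bio *bio)
{
	struct timespec start, end;
	spinlock_t *lock;
	s64 elapsed_us;

	getnstimeofday(&start);
	lock = <<find a lock for a certain "device part" based on bio>>;
	spin_lock(lock);
	/* simulate IO .... */
	getnstimeofday(&end);
	/* busy-wait for the rest of the target latency; needed is in us */
	elapsed_us = div_s64(timespec_to_ns(&end) - timespec_to_ns(&start), NSEC_PER_USEC);
	if (elapsed_us < needed)
		udelay(needed - elapsed_us);
	spin_unlock(lock);
}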
Without udelay this seems to me to be a make_request function that supports parallel requests between device "parts" (one lock per device part).
Would udelay break this? I mean: could udelay be executed in PARALLEL for two (3, 4, ...) IO requests? Would it be atomic in the context of the current lock? Or would it destroy the concurrent behaviour? (Sorry for such basic questions, I'm new to Linux.)
How many udelays could be executed in parallel? (One per CPU?)
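The remaining-delay arithmetic is the part of the snippet I'm least sure about: if the simulated IO already took longer than the target latency, the subtraction must not go negative, since as far as I know udelay takes an unsigned argument. Below is a minimal user-space sketch of that arithmetic. The helper names timespec_diff_ns and remaining_delay_us are my own for illustration, not kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Signed difference end - start in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *start,
                                const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/*
 * Microseconds still to wait so that the whole operation takes needed_us.
 * Returns 0 when the elapsed time already meets or exceeds the target.
 */
static uint64_t remaining_delay_us(uint64_t needed_us,
                                   const struct timespec *start,
                                   const struct timespec *end)
{
    int64_t elapsed_ns;
    uint64_t elapsed_us;

    assert(start != NULL && end != NULL);

    elapsed_ns = timespec_diff_ns(start, end);
    if (elapsed_ns <= 0)
        return needed_us;   /* clock did not advance: wait the full time */

    elapsed_us = (uint64_t)elapsed_ns / 1000;
    return elapsed_us >= needed_us ? 0 : needed_us - elapsed_us;
}
```

In the driver, the result would be what gets passed to udelay, and the call would be skipped when it is 0.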
----------------------------2--------------------
When is the transferred data available to user space (i.e., to the application that issued the IO):
- after I've copied the data to the buffer (e.g. with memcpy);
- or after the call to bio_endio(bio, status)?
This affects where I should put the delay function.
----------------------------3--------------------
As I understand it, allocating the driver memory (in my case, the device memory) with vmalloc is not a good idea because of the mapping overhead, especially if I need a huge amount of data (up to 50 GB). Would alloc_page (with the pages indexed in a radix tree) work fine for 50 GB (provided, of course, that this much RAM is available)? Are there other options?
Thanks a lot for your help!
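To make the alloc_page idea concrete, here is a minimal user-space sketch of the indexing arithmetic I have in mind: each 512-byte sector maps to a page index (the radix tree key) plus a byte offset inside that page. I believe brd does something similar. The 4 KiB page size and all SIM_* names and helper functions are my assumptions for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_SECTOR_SHIFT 9                      /* 512-byte sectors */
#define SIM_PAGE_SHIFT   12                     /* assumed 4 KiB pages */
#define SIM_PAGE_SIZE    (1ULL << SIM_PAGE_SHIFT)

/* Number of pages needed to back a device of device_bytes (rounded up). */
static uint64_t pages_needed(uint64_t device_bytes)
{
    assert(device_bytes <= UINT64_MAX - (SIM_PAGE_SIZE - 1));
    return (device_bytes + SIM_PAGE_SIZE - 1) >> SIM_PAGE_SHIFT;
}

/* Index of the page that holds the given sector (the lookup key). */
static uint64_t sector_to_page_index(uint64_t sector)
{
    return sector >> (SIM_PAGE_SHIFT - SIM_SECTOR_SHIFT);
}

/* Byte offset of the given sector inside its page. */
static uint32_t sector_to_page_offset(uint64_t sector)
{
    return (uint32_t)((sector << SIM_SECTOR_SHIFT) & (SIM_PAGE_SIZE - 1));
}
```

Taking 50 GB as 50 GiB, pages_needed gives 13,107,200 pages, so the index would have to hold that many entries if every page is populated. Allocating pages lazily on first write would keep the real footprint lower.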
Best,
Tim