asciilifeform: ( i can already picture mircea_popescu spitting out his breakfast, 'modern hdd dun have cylinder, you nut' -- except, it in effect DOES, fetches massive blox , whether mechanical or ssd, by design, for ages now )☟︎
asciilifeform: (lookup table consisting of, literally, rows of full txid followed by where-on-disk, that you'd chug through whenever collisionbit is set)
asciilifeform: Framedragger, ben_vulpes : i predict that there will be few enough collisions that you can shunt ~all~ collision cases off to an O(N) lookup table, for next decade, with ~0 measurable loss of avg.case performance.
asciilifeform: imho the 'hard part' is not even to implement this table, it is freshman homework, but to unravel the liquishit in trb and learn where to even put the lookup/write !☟︎
asciilifeform: now store each tx on a 4096bt 'cylinder' boundary, and you're golden
asciilifeform: btw mircea_popescu's suggestion from last night, to dispense with 'blocks' as separate class of object, and simply store sequence of tx, with 'this was in block B' field added -- would work quite well with this scheme.
asciilifeform: i was recently considering implementing a similar scheme for phuctor, and realized today that it would work entirely well in trb.
asciilifeform: if you want to consider 'what would give best performance on the iron we have, with minimal moving parts, with the trb we have' this is what comes out.
asciilifeform: so in practice, you can have O(1) seeks ~while~ storing the blocks back-to-back in a classical trb.
asciilifeform: that is, you don't need 8 bytes to say which 4kB slice has the beginning☟︎
asciilifeform: except that you don't actually need byte-addressing !
asciilifeform: however block index is not large -- say we have 500,000 blocks, and you want to know where on disk lives its beginning, and your disk takes 64bit lba addressor. so you need 500,000 * 8bytes == 4,000,000 bytes ! 3 floppy's worth
asciilifeform: fixed 1MB would give you O(1) block seek.
asciilifeform: because you can point inside'em in the index.
asciilifeform: Framedragger: the 1024bt thing works in all cases, it simply means that you don't have to point to beginnings of blocks, but to a place very near the tx
asciilifeform: considering that we have, what, 200 mil tx; and l1 table gives you 4billion slots.
asciilifeform: this form will almost certainly suffice for another decade of trb
asciilifeform: the most basic setup has an l1 which immediately proceeds to the 'slow' table (where you have to look at multiple offsets in case of collision, to find your tx)
asciilifeform: or, rather, to where-to-start-looking-for-tx
asciilifeform: then a 31bit (remember, highest bit means 'go to l2 table') offset gives you a 2^^41 bit space in which your table entry can point to the beginning of a block.
asciilifeform: say we put down blocks on 1024 byte alignments.
asciilifeform: btw Framedragger , we can do 1 better than 'block index'
asciilifeform: (and thereby a 16GB l1 table would give you wholly in-memory index for ~all tx lookups)
asciilifeform: it is statistically possible that we don't even have ONE collision yet☟︎
asciilifeform: ok, now question for ben_vulpes , trinque , mircea_popescu , et al : anybody got a quick c proggy that will eat blkcut's blocks and produce a linear list of tx ? i'd like to actually calculate the current 32bit tabular collision rate.
asciilifeform: ideally you would even have separate disks for blocks and tx indices.
asciilifeform: also this is probably obvious, but you want separate disk for this
asciilifeform: ...rather than to have any serious number crunching in kernelspace
asciilifeform: the Right Thing would probably be to have a very simple kernel driver that takes a specially-marked disk partition and gives userland trb linear use of it, as plain array☟︎☟︎
asciilifeform: (you may as well stuff it in the kernel, then)
asciilifeform: who the fuck wants trb running as root
asciilifeform: Framedragger: if you want your own fs, and to write to contiguous disk, you do
asciilifeform: i dare say the tabular algo is simple enough to actually be implementable.
asciilifeform: however mircea_popescu's 'let's patch ext4' has same.
asciilifeform: if you want to actually have the contiguous 16GB
asciilifeform: it has obvious minus, in that you need a kernel driver
asciilifeform: yes, first time, i kept waiting for someone to open a schoolbook and describe this very elementary algo
asciilifeform: you'll have somewhere between 0 and 2 , statistically, tx per 32bit table entry.
asciilifeform: Framedragger: it's good for many, many times the current cumulative-tx