Framedragger: and the slow-table.. hm
Framedragger: oh
Framedragger: there's an assumption as to the max num of collisions here, of course, but obvs in practical terms it's a very safe assumption...
Framedragger: yeah, makes sense to me (on average, the current likelihood of a particular 32-bit entry being populated is < ~6%)
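A quick check of that "< 6%" figure (editor's sketch; assumes ~250mn transactions hashed uniformly into 2^32 slots, per the numbers in this thread, so per-slot occupancy is approximately Poisson):

```python
import math

SLOTS = 2 ** 32          # one slot per 32-bit tx-hash prefix
N_TX = 250_000_000       # rough current transaction count, per the thread

lam = N_TX / SLOTS                       # expected keys per slot (~0.058)
p_occupied = 1 - math.exp(-lam)          # P(slot holds >= 1 key)
p_collision = p_occupied - lam * math.exp(-lam)  # P(slot holds >= 2 keys)

print(f"occupied:  {p_occupied:.2%}")    # ~5.65%, i.e. the '< 6%' figure
print(f"collision: {p_collision:.3%}")   # ~0.16% of slots hold > 1 entry
```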
Framedragger: hmh, at least functions make up not the *worst* interface seen, but still lotsa work and weird mutable shit sprayed all around, i imagine
Framedragger: yeah that's one reason i'm not too attracted to trb, tbh, the amount of sewage gruntwork required to decouple shit from the monolith.
Framedragger: decent ideas, what can i say
Framedragger: aha right.
Framedragger: i guess one can imagine a single sequence of tx then, simply.
Framedragger: it's a really Good Thing that the hashing function which spits out transaction hashes gives *uniform distribution*. no congestion / too many collisions expected, and this scheme leverages that.
Framedragger: (i see how good it is to be aware of how actual disks read data here. some theoretician would propose a pointer-exact-location scheme instead...)
Framedragger: ahh, yeah okay, back-to-back you mean exactly that, not having to allocate 1MB per block.
Framedragger: yeah i forget sometimes. fixed block length is nice for this...
Framedragger: well *that sounds like a very decent idea*. :)
Framedragger: yeah.
Framedragger: yeah, given actual tx amounts.. 250mn vs. 2^32
Framedragger: right! ahh that's nice. (so just to clarify, the 1024 byte block trick wouldn't work if there's a collision (unless additional budget / w/e))
Framedragger: in this case.
Framedragger: bear with my slowness, can you clarify what it looks like if there's a collision in the initial lookup?
Framedragger: as in, disk/partition alignments?
Framedragger: sure
Framedragger: makes sense.
Framedragger: have separate service taking care of that? i mean, kernel driver is this kind of 'externality', too (and also ring0)
Framedragger: it uses `lseek64()`
Framedragger: here i have an ssd seek profiler which just needs root
Framedragger: i'm not sure if you do need driver ☟︎
Framedragger: at least i have the excuse of not having looked at the bdb problem / staying away from trb for the time being :p
Framedragger: is this the first time you articulated this approach here? i think that's the best one can have for a fs-tx-db
Framedragger: this is quite nice, and as you say, seek operation already gives a small chunk which should cover most/all tx for current state of affairs (total number of transactions)...
Framedragger: oh i finally understood, literally all there is when one seeks to location 3ec455a2 is a list of block numbers. (or single block number.)
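A minimal sketch of the lookup as described here, assuming a flat file with one fixed-size slot per hash prefix. The 16-bit prefix and 8-byte slot are demo-sized so the table stays tiny (the scheme in the thread uses 32-bit prefixes, and this slot layout is hypothetical); the arithmetic is the same either way:

```python
import os, struct, tempfile

PREFIX_BITS = 16                 # demo-sized; the thread's scheme uses 32
SLOT_SIZE = 8                    # room here for two 4-byte block numbers
TABLE_BYTES = (1 << PREFIX_BITS) * SLOT_SIZE

def slot_offset(txhash_hex):
    # exact location computed from the hash prefix: no search, one seek
    prefix = int(txhash_hex[: PREFIX_BITS // 4], 16)
    return prefix * SLOT_SIZE

# build an empty table, then file one tx under its prefix slot
fd, path = tempfile.mkstemp()
os.ftruncate(fd, TABLE_BYTES)    # sparse file; unpopulated slots read as 0

txhash = "3ec455a2" + "00" * 28  # the example prefix from the thread
os.pwrite(fd, struct.pack("<II", 35461, 0), slot_offset(txhash))

# lookup: one seek + one small read yields the candidate block number(s)
raw = os.pread(fd, SLOT_SIZE, slot_offset(txhash))
blocks = [b for b in struct.unpack("<II", raw) if b]
print(blocks)                    # [35461]
os.close(fd); os.unlink(path)
```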
Framedragger: why the need for "the machine might have to try 2 or 3 blocks before it finds tx" then? and if so, then no guarantee of only 1 seek?
Framedragger: asciilifeform: wait, what is "block index"? just the integer denoting block number?
Framedragger: efficient seeking.
Framedragger: asciilifeform: hmm, very nice. i suppose it's as close to fixed-length as is possible given current bitcoin
Framedragger: trinque: it's just a kindergarten way of wrapping up some syscalls. will obviously benchmark outside it later. i wasn't completely certain that my tool wouldn't trash the host fs. :)
Framedragger: sorry.
Framedragger: (right?)
Framedragger: aha, right! so it's basically a (small) hashtable.
Framedragger: (yeah btw, just ftr, symlink *creation* under populated dir structure (`ln -s files_f1/block35461.txt dc/dc89c1f2b58909d3814b250a731a9b9b791b092759553e3ba6579ffaad3a7565`) is slow. however, the creation was done using shellscript, need to move to c to be able to actually profile with precision.) ☟︎
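The measurement being described, sketched in python rather than shellscript or the planned c (the layout and counts are illustrative, not the original 1mn-link run):

```python
import os, tempfile, time

root = tempfile.mkdtemp()
target = os.path.join(root, "block35461.txt")
open(target, "w").close()

# create symlinks named like tx-hash hex under a two-level dir layout
t0 = time.monotonic_ns()
n = 1000                                   # small demo; the real run used 1mn
for i in range(n):
    name = f"{i:064x}"                     # stand-in for a 64-hex tx hash
    d = os.path.join(root, name[:2])       # first hex byte picks the subdir
    os.makedirs(d, exist_ok=True)
    os.symlink(target, os.path.join(d, name))
elapsed = time.monotonic_ns() - t0
print(f"{elapsed / n:.0f} ns per symlink creation")
```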
Framedragger: aha, right.
Framedragger: so the 'matching' (index lookup) is the 99% here, right?
Framedragger: for now, just generated 1mn symlinks with names corresponding to transaction hash hex.
Framedragger: what i want to do later when i find time is, actually read file, too, of course.
Framedragger: aha, right. i haven't even looked at bdb.
Framedragger: will get a way to test real disk soon, didn't want to run on personal trashy PC, hence shitty server
Framedragger: note, it's just some additional syscalls, re docker
Framedragger: uhh
Framedragger: ext4, yes
Framedragger: ssd under docker fs, later - real disk
Framedragger: yes, ssd
Framedragger: asciilifeform: call to `readlink()`.
Framedragger: no
Framedragger: sorry - yeah
Framedragger: orly? this is *ns* (10^-9), mind you. hm. and this is just resolution of path with single symlink in it
Framedragger spent longer than wants to admit sorting out his heap and valgrind'ing. too much python is bad for a person
Framedragger: getting ~4000-7000ns for symlink resolution to real path for a 1mn symlink dir structure, e.g.
Framedragger: http://btcbase.org/log/2017-03-10#1624048 << feltbad, so wrote that stupid symlink and fs profiling tool that no-one wanted to do. results later. while at it: anyone knows if CLOCK_MONOTONIC has sufficient resolution for profiling? asciilifeform? allegedly - yes. ☝︎☟︎
Framedragger: myeah #trilema is basically valgrind.
Framedragger: o/
Framedragger: i like my rc airplanes. "the will of history necessitates you to X" has a marx'ified hegelian vibe :p ☟︎☟︎☟︎
Framedragger: mircea_popescu: basically, and that's strictly it - because i couldn't intuitively wrap my head around the fact that average number of nodes per specific folder would be _really_ low if depth is say more than 3. still weird in my head, but yeah.
Framedragger: ..and so she goes, http://www.reuters.com/article/us-southkorea-politics-idUSKBN16H066
Framedragger: (deeper path => slower traversal but fewer nodes per folder, up to the point where e.g. 'fast symlinks' can be used by ext4 (http://lxr.free-electrons.com/source/fs/ext4/inode.c#L148) - maybe; etc.)
Framedragger: << (obviously these'd be more useful with actual empirical numbers of average/median seek times, writes, seek/write as things get congested, etc.)
Framedragger: 2) http://fd.mkj.lt/stuff/fsgraph2.png
Framedragger: d'oh! thanks.
Framedragger: (really kindergarten level simple but wanted to see this myself, could be useful for reference - unless it's incorrect..)
Framedragger: 2) http://fd.mkj.lt/stuff/fsgraph1.png - up till 10**24 (which is when avg number of nodes per folder reaches 1000 for total depth of 8)
Framedragger: 1) http://fd.mkj.lt/stuff/fsgraph1.png - up till 100bn objects (to compare, current number of bitcoin transactions ~= 0.2bn)
Framedragger: re. fs nodes, couldn't sleep + not sure if this makes sense, so just throwing these out - barebones super simplistic (function is `n_objects_to_store ^ (1 / folder_depth)`) plots showing expected average number of nodes per folder (assumptions are no bias in hashspace and also equal share of hash bits per folder level) - it may not be intuitive how low the averages are until you look:
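The plotted function evaluated at the thread's own numbers (trivial, but for reference):

```python
def avg_nodes_per_folder(n_objects, folder_depth):
    # equal hash bits per level => each level fans out by the same factor,
    # so average occupancy per folder is the depth-th root of the total
    return n_objects ** (1 / folder_depth)

print(avg_nodes_per_folder(250_000_000, 8))   # ~11: current btc tx count
print(avg_nodes_per_folder(10 ** 24, 8))      # 1000: the 10**24 point
```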
Framedragger: /me probably off till (maybe much) later
Framedragger: ^ the above is plain-obvious, but just ftr.
Framedragger: assuming equally distributed transaction hashspace, if you want your tree to fill up to 1000 nodes on average per folder at a given depth, you'd be storing 10^24 transactions. but this assumes that every folder depth gets assigned an equal number of bits to represent, of course.
Framedragger: for symlink fs testers (or maybe selfnote for later): note that if you allow for sufficient folder tree depth, the "1000s of symlinks per dir" won't realistically happen when storing, say, bitcoin transaction hashes. the latter have 256 bits => 64 hex chars. if you allow for depth of 8 where last level (8) is symlink itself, you get 32 bits per folder level.
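Concretely, the depth-8 split of a 64-hex-char tx hash (sketch; the example hash is the one quoted elsewhere in this log):

```python
def tx_path(txhash_hex, depth=8):
    # 64 hex chars / depth 8 => 8 hex chars (32 bits) per folder level;
    # the last component is the symlink itself
    step = len(txhash_hex) // depth
    parts = [txhash_hex[i : i + step] for i in range(0, len(txhash_hex), step)]
    return "/".join(parts)

h = "dc89c1f2b58909d3814b250a731a9b9b791b092759553e3ba6579ffaad3a7565"
print(tx_path(h))
# dc89c1f2/b58909d3/814b250a/731a9b9b/791b0927/59553e3b/a6579ffa/ad3a7565
```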
Framedragger: # time ls real 0m15.146s"""
Framedragger: # echo 3 >/proc/sys/vm/drop_caches
Framedragger: real 0m0.069s user 0m0.000s sys 0m0.068s
Framedragger: # time ls
Framedragger: # cd /tmp
Framedragger: Take a look at this:
Framedragger: also, """But I appear to have a lingering effect that seems to have started from the time my /tmp directory had the millions of files in it.
Framedragger: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=771573 << dude runs into weird cups printing issue which creates millions of symlinks in /tmp as side effect (...). side effect of *that* (well, presumably that) is system fails to boot. because of course.
Framedragger: jolly lil' project.
Framedragger: yeah, okay; as long as it's not fixed-width trb-i, no way around this.
Framedragger: hmh right, right. no way around it, i guess.
Framedragger: right, that part is cool.
Framedragger: would this be performant enough even theoretically, given no way to use offsets?
Framedragger: mircea_popescu: re. http://btcbase.org/log/2016-12-22#1588180 , for my elucidation, so would the symlinks just point to a particular 1MB block file? ☝︎
Framedragger: (but maybe you covered that, too, and i forgot in logs.)
Framedragger: but with transactions, you can be sure that once it returns, it will have written to disk. fsync can still be on. (iirc).
Framedragger: myeah
Framedragger: (and if you now say 'db is lost cause anyway' while not linking to code/config *again*, i'll grit teeth angrily)
Framedragger: but i do hope you're doing the former, i mean i assumed so. that's the lowest-hanging fruit re. 'how do i do batch writes to db'
Framedragger: asciilifeform: on top of 'transactions', postgres has 'checkpoint' parameter. but you probably won't like it because of the whole 'not turning off fsync' thing
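For reference, the knobs in question as they appear in a stock postgresql.conf (values shown are illustrative defaults, not recommendations):

```
# postgresql.conf -- durability / write-batching knobs mentioned here
fsync = on                    # turning this off risks corruption on crash
synchronous_commit = on       # 'off' batches WAL flushes, risks recent tx
checkpoint_timeout = 5min     # max time between automatic checkpoints
max_wal_size = 1GB            # checkpoints also triggered by WAL volume
```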
Framedragger: ya, ok.
Framedragger: instead, it's "just" a matter of having a however-deep directory tree with symlinks as the leaves.
Framedragger: for a minute i thought (don't know why) that what is *additionally* needed is the capability to have paths of /symlinks/to/symlinks/.
Framedragger: oh wait, i phrased this incorrectly while at the same time horribly mis-reading: sorry, this is about max depth of path composed of symlinks.
Framedragger: well there ya go. :)