mircea_popescu: wait, putin bombed his own loyal subjects ? what is he supposed to be, bahamas ?
a111: Logged on 2017-04-05 18:06 Framedragger: quite sure that it's possible to dump dom in selenium but in any case, yes dumping it seems like a prerequisite, like what archive.is does and what phf said above
phf: http://btcbase.org/log/2017-04-05#1638105 << dumping dom is not hard (it's innerhtml attribute on document.body/head), extracting images/objects is harder. used to be able to do it using canvas, but now there's security provisions. you literally have to go into the browser's cache (via internal apis) to get the graphic, etc. back. or else redownload, which is rife with potential issues
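
A minimal sketch of the "or else redownload" route phf mentions, assuming the DOM has already been dumped to a string. The names `ResourceCollector` and `list_resources` are hypothetical, and the tag-to-attribute table is an illustrative subset, not a complete list of embeddable resources. It only enumerates the URLs; the fetching itself is where the "potential issues" live.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of embedded resources from a dumped DOM string."""

    # tag -> attribute that carries the resource URL (illustrative subset)
    URL_ATTRS = {"img": "src", "object": "data", "embed": "src"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # resolve relative references against the page URL
                self.resources.append(urljoin(self.base_url, value))


def list_resources(dom_html, base_url):
    """Return absolute URLs of images/objects referenced by the dump."""
    collector = ResourceCollector(base_url)
    collector.feed(dom_html)
    collector.close()
    return collector.resources
```

A redownloaded copy may differ from what the browser actually displayed (cookies, referrer checks, content that changed in between), which is the weakness of this approach compared to reading the browser's cache.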
Framedragger: mircea_popescu: lenin returned to russia 100 years ago in april. COINCIDENCE?
Framedragger: i wonder if "can run archive requests on 'uncleaned' (i.e.: already possibly infected) VM" could be allowed for. it's not exactly a gpg-signed-msg timestamping service.
Framedragger: asciilifeform: do you mean that "single archive request handled by dedicated process which terminates at end of request" wouldn't be enough in terms of cleanup? (due to js as vector of attack to machine?)
mircea_popescu: danielpbarron what format is the mysql database stored as, sql ? its native binary whatever it is ?
Framedragger: quite sure that it's possible to dump dom in selenium but in any case, yes dumping it seems like a prerequisite, like what archive.is does and what phf said above
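
A hedged sketch of the Selenium DOM dump Framedragger refers to. It assumes an already-started WebDriver that has loaded the page. `execute_script` is Selenium's standard call for running JavaScript in the page; the helper name `dump_dom` is hypothetical. The function takes any object with that method, so no browser is needed to read or exercise it.

```python
def dump_dom(driver):
    """Return the live DOM, serialized, from a Selenium-style driver.

    Unlike the HTML the server originally sent, this reflects whatever
    scripts have done to the document so far.
    """
    return driver.execute_script(
        "return document.documentElement.outerHTML;"
    )
```

With a real driver this would be called after `driver.get(url)`, once the page has settled.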
mircea_popescu: part of the problem being that common browser instructed to "Save page" will proceed to save the html rather than the dom, or however you call the ast in www.
mircea_popescu: i suppose in principle could exist as a "proper archival browser extension". i suppose very close to what ppl like browsershots etc do
Framedragger: (and that use case existing is good signal in this case)
Framedragger: CompanionCube: that's just one use case, to be clear
Framedragger: worx. (heavier than scrapy in that can handle 'need js to click on 'next' button' logic)
mircea_popescu: Framedragger would be nice, but conceivably you'd be stuck bolting js to links or such
mircea_popescu: else any random rustard can throw a wrench in your whole thing. "gotta support fdlkgjkldfjl!!~11"
mircea_popescu: yeah. it sounds like the right thing but in the process of so sounding hides under a welded shut hood all the design trade-offs which'll need to be made.
mircea_popescu: i suppose this is part of the problem of "wtf were the design goals again ?" -- swiss knife syndrome, many things for many people at many different times.
mircea_popescu: but the majority of those webpages were republican rather than imperial.
mircea_popescu: at least to my eyes, archiving to date was there to provide a sort of cheaper and lighter deedbotting for webpages.
mircea_popescu: honestly i'm not even sure i actually want to archive, eg, fake news sites.
doppler: ^-- that's what I was referring to.
mircea_popescu: well i didn't want to archive "artists" flash bs in 2007 either
mircea_popescu: doppler flash is still live and well in the "browser games" niche. heck, adobe recently made new linux flash (after years of stfu)
CompanionCube: asciilifeform: problem is that ipads will display different to desktop
mircea_popescu: im not even sure wtf the design goals would be, which is why we're wallowing
doppler: I'm so glad flash player sites are mostly gone now, but JS has just taken its place really
CompanionCube sometimes wishes we could banish entirely-JS 'websites' from the face of the earth
mircea_popescu: half literate fucktards, all typefaces are new to them all the time.
mircea_popescu: "why THE FUCK do you want to mess with the fonts" "because we truly have nothing to do with our time" "Do you understand reading speed decreases 3x if the reader has to deal with unfamiliar typefaces ?" "uuga booga"
mircea_popescu: (this happens a lot more than anyone sane would on his own power imagine)
phf: right, not to mention competing rendering engines ("system wide library doesn't kern glyphs the way our ui designer thinks is appropriate so we do it ourselves!!1")
mircea_popescu: it's the way of the future ; everything is connected etc.
phf: a sentence like "hello world this is test" might get an invocation like render("world this") followed by render("hello") followed by render("is test"), simply because higher level widget engine decided that's the order of exposure, or hierarchy, or whatever
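
A toy illustration of phf's example, not a model of any real rendering stack. Each intercepted call is recorded as a `(text, x_offset)` pair; the offsets are an assumption added here to show what information would be needed. Someone merely hooking the glyph library sees only the text, in call order.

```python
def naive_text(calls):
    """Join fragments in the order the render calls were observed."""
    return " ".join(text for text, _x in calls)


def positioned_text(calls):
    """Join fragments by on-screen position (hypothetical x offsets)."""
    ordered = sorted(calls, key=lambda call: call[1])
    return " ".join(text for text, _x in ordered)


# invocation order from phf's example, with made-up x offsets
calls = [("world this", 6), ("hello", 0), ("is test", 17)]
```

Here `naive_text(calls)` scrambles the sentence while `positioned_text(calls)` recovers it, and that works only because the layout positions were supplied by hand.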
mircea_popescu: phf provided eg "java runtime fonts" aren't there on top of x11-fonts etc
phf: asciilifeform: there's ~one~ true type library, but it's called at random times to render a small part of the page, so by simply following the invocations you won't be able to reconstruct how the individual results fit into the on-screen
mircea_popescu: sorry i can't hear you over the sound of hillary clinton's pantsuit.
doppler: what about the website-viewer's right to make their own decisions?
mircea_popescu: doppler but you know, when people get self-determination and the right to make their own decisions, everything improves.
doppler: too bad web developers ever received the power to control the rendering of their sites so closely
Framedragger: mircea_popescu: not familiar / wouldn't know. my exposure to the whole thing was literally just "found relevant library; hop on irc to ask the author; chat for a while; realise he's blind; ask about his experience"
phf: but it doesn't invoke truetype ~as one last pass~. instead you have fifty different truetype invocations to lay out a small surface, that's placed into an hierarchy of such surfaces
mircea_popescu: phf> state of the art for blind folx is miserable. << about same as in 1997.
phf: asciilifeform: that is not though how rendering works on "modern os". unlike x there's no central authority on glyph rendering. instead you have layers of surfaces that each app manages on its own.
phf: basically agent on top of internet explorer/others that, with a specially annotated page (ARIA standard) can make the experience usable
Framedragger: i can ask one such folk ('camlorn on #libaudioverse - has his own 3d audio library, competent at what he does). i recall him explaining the shitshow that was getting cs degree by translating cs paper pdf (horror)
phf: state of the art for blind folx is miserable.
mircea_popescu: asciilifeform the ocr idea is miserable, sadly, because so many glyphs and nonsense.
mircea_popescu: asciilifeform> ideally you want it running in a qemu-like thing with randomly-generated instruction set. << this may be overkill. could as well run it in a rom-os machine or something.
phf: doesn't have to contain rather. i'm sure they don't scrape it diligently enough.
Framedragger: asciilifeform: as phf said, archive.is output does not contain js. just to clarify.
mircea_popescu: ah i see alf addressed. it's an exercise in typical expertsexchange wankery, total misunderstanding of engineering etc.
Framedragger: phf: ah, but i meant the initial rendering phase - the 'archive this plz' process itself. but thanks for clarifying yeah
phf: Framedragger: yes, archive.is is a headless webkit. it loads the page with all the resources, it lets the javascript run until the DOM is in some "final" state, it snapshots the DOM. at this point you no longer need javascript to further render page
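
A rough sketch of the last step phf describes: once the DOM snapshot exists, the scripts can be dropped from the stored copy. This is not archive.is's actual code, which is not shown in the log. `ScriptStripper` and `strip_scripts` are hypothetical names, and the filter is deliberately simplistic: it drops `<script>` elements and `on*` handler attributes, ignores comments and doctype, and does not touch `javascript:` URLs.

```python
from html import escape
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-serializes HTML while dropping <script> and on* attributes."""

    def __init__(self):
        # keep entities as written instead of decoding them into text
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_script = False

    def _open_tag(self, tag, attrs, closer=">"):
        kept = [(k, v) for k, v in attrs if not k.startswith("on")]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}{closer}"

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        elif not self.in_script:
            self.out.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag != "script" and not self.in_script:
            self.out.append(self._open_tag(tag, attrs, closer=" />"))

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        elif not self.in_script:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.in_script:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.in_script:
            self.out.append(f"&#{name};")


def strip_scripts(dom_html):
    """Return the snapshot with script elements and handlers removed."""
    stripper = ScriptStripper()
    stripper.feed(dom_html)
    stripper.close()
    return "".join(stripper.out)
```

The point of the design is that the risky part (running the page's js) happens once on the archiver, and the reader of the archived copy gets static markup only.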
mircea_popescu: ben_vulpes it's not measurably bad. theoretically it weakens the problem of "spend someone else's inputs".
phf: Framedragger: you don't have to expose the user to js with archive.is approach
Framedragger: so, if tmsr were to have its own archiver, i don't think archive.is' approach is the way to go, even though it is (arguably, maybe) the most 'reliable' / 'true' (actual js rendering in browser). exposing user to JS defeats half its purpose. imho.
phf: also, these days you have js packing frameworks (like browserify) doing frontend on-demand js loads, which means that ~browser~ doesn't know when page load is complete. you get "lightweight" pages, that don't actually have anything until 8mb of javascript gets its shit together.
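
Since the browser cannot say when such a page is "done", one common workaround is a heuristic: poll the DOM and treat it as final once it stops changing for a few polls in a row. The sketch below is an assumption about how an archiver might do this, not something described in the log. `wait_until_stable` and its parameters are hypothetical, and `take_snapshot` stands for any callable returning the current DOM as a string.

```python
import time


def wait_until_stable(take_snapshot, quiet_polls=3, max_polls=50,
                      interval=0.5, sleep=time.sleep):
    """Poll until the snapshot is unchanged for `quiet_polls` polls.

    Returns the stable snapshot, or None if the page never settles
    within `max_polls` polls (e.g. a page that mutates forever).
    """
    last = take_snapshot()
    unchanged = 0
    for _ in range(max_polls):
        sleep(interval)
        current = take_snapshot()
        if current == last:
            unchanged += 1
            if unchanged >= quiet_polls:
                return current
        else:
            # the page is still changing, restart the quiet counter
            unchanged = 0
            last = current
    return None
```

The heuristic can be fooled both ways: a slow on-demand load may start after the quiet window has passed, and a page with a ticking clock or rotating ad never goes quiet, so the timeout path has to be handled by the caller.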