I made a nodejs script that reads several GB of XML, performs some computation, and stores millions of records in MongoDB.
Shit takes over an hour to complete.
Would Bun be any faster?
Or is it just HTTP requests that it can do faster, not raw JavaScript parsing and number crunching?
How are you doing that? I hope you are using the stream API anon.
I use streams to read the ZIPs and parse the XML, yes.
Parsing the XML takes the longest time.
It's gigabytes of zipped XML; extracted, it would be about 50GB I think (but I never store the unpacked versions).
My real question is: would the JS engine have a significant effect on performance or are they all pretty much the same?
It's not a huge issue because I only have to run the full script a few times a year and I just leave my computer on overnight, but it would be nice if I could bring it down to under an hour just by switching to a different platform.
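For reference, here is a rough sketch of the kind of pipeline described above: stream each ZIP, hand the XML entries to an event-based parser, and batch the Mongo writes. The unzipper and mongodb packages, the file name, and parseXmlStream are stand-ins for illustration, not anon's actual setup; error handling and backpressure are left out.

const fs = require('fs');
const { finished } = require('stream/promises');
const unzipper = require('unzipper');            // streaming ZIP reader
const { MongoClient } = require('mongodb');

async function run() {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();
  const coll = client.db('realestate').collection('parcels');

  // buffer records so Mongo gets a few thousand documents per write
  // instead of millions of single inserts
  let batch = [];
  const flush = () => {
    if (!batch.length) return Promise.resolve();
    const docs = batch;
    batch = [];
    return coll.insertMany(docs, { ordered: false });
  };

  const zip = fs.createReadStream('real-estate.zip').pipe(unzipper.Parse());
  zip.on('entry', (entry) => {
    // each entry is a readable stream of one XML file inside the ZIP;
    // parseXmlStream stands in for the event-based parser (see the sax sketch further down)
    parseXmlStream(entry, (record) => {
      batch.push(record);
      if (batch.length >= 5000) flush();
    });
  });
  await finished(zip);

  await flush();
  await client.close();
}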
when comparing v8 to jscore, jscore is a few % faster, so you're gonna see some benefits but nothing that would make a multi-hour operation take under an hour.
the other benefit to look at is that bun's FFI is faster and less bloated than node's, so while there IS potential here, it'd require you to use packages that were designed to take advantage of bun, so it's not a drop-in replacement either. you should benchmark which component is the slowest, switch to bun, and replace just that slow component with bun's equivalent.
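A crude way to confirm which component is actually the slowest before switching runtimes, assuming the stages of the existing script are already separate functions (the labels and the wrapped function names below are invented):

const { performance } = require('perf_hooks');

const timings = {};

// wrap a stage function so its cumulative wall-clock time is recorded under a label
function timed(label, fn) {
  return async (...args) => {
    const start = performance.now();
    try {
      return await fn(...args);
    } finally {
      timings[label] = (timings[label] || 0) + (performance.now() - start);
    }
  };
}

// e.g. wrap the suspect stages:
//   const parseEntry  = timed('xml-parse', parseEntryImpl);
//   const insertBatch = timed('mongo-insert', insertBatchImpl);
// run the job, then print the totals:
//   console.table(timings);

Whichever label dominates is the part worth porting or replacing first.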
>Parsing the XML takes the longest time.
Are you reading the XML into a DOM tree and then traversing it? If so, try event-based parsing instead, if possible.
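To make the distinction concrete: a DOM parser builds the whole tree in memory before you can touch it, while an event-based parser fires callbacks as the tags stream past and keeps nothing you don't ask for. A minimal sketch using the sax package; the Parcel element and its id attribute are invented for illustration.

const sax = require('sax');

// event-based parsing: only the element currently being assembled is held in memory
function parseXmlStream(xmlStream, onRecord) {
  const parser = sax.createStream(true);         // strict mode
  let current = null;

  parser.on('opentag', (node) => {
    if (node.name === 'Parcel') {
      current = { id: node.attributes.id, text: '' };
    }
  });

  parser.on('text', (text) => {
    if (current) current.text += text;
  });

  parser.on('closetag', (name) => {
    if (name === 'Parcel' && current) {
      onRecord(current);                         // hand one finished record to the caller
      current = null;
    }
  });

  parser.on('error', (err) => xmlStream.destroy(err));

  xmlStream.pipe(parser);
}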
I use event based parsing.
I think parsing XML is just a lot slower than parsing JSON.
I think XML was just never meant for such large data sets (it's a database of all real estate in my country, including their geometry; I use it to generate maps).
The real issue is the organization publishing that data refuses to use a sane data format.
>just use callbacks bro
>calling slow js each step of the deserialization is totally faster
>why would you just parse it and then pick out what you want
NTA, but you're an idiot. This only works for seekable streams being handled by a callback system that understands 'piss off processing this block.' XML isn't a structured format that supports this. Insertion into an object/map probably isn't the bottleneck; if anything, subjecting the parser's loop to your shitcode would be. I hate high-level code monkey morons. Stop mindlessly repeating what you hear other people say.
>reading gigabytes of XML into memory and then traversing it to find the one or two elements you need is much better than being notified by the parser when an element is encountered, then checking to see if it’s the ones you’re interested in and performing calculations accordingly
[...]
you know what people say about using the right tool for the job? Well, you didn't
just the fact that I don't have to go through the retarded TS boilerplate to start a project is enough for me to switch. everything else is just icing on the cake.
What the fuck is a bros?
Tested it.
It fucking sucks.
It cannot even serve a simple JSON response out of the box.
brainlet
const path = "/path/to/package.json";
const file = Bun.file(path);
const contents = await file.json();
I'm not seeing any code related to serving a fucking JSON response in your post, just reading a JSON file and that's it.
See? Bun fucking sucks.
Bun.serve()
how about you read the docs you fucking retard
https://bun.sh/docs/api/http
wtf, they have discord embedded in the error modal.
I'm fully convinced these new tools and frameworks serve no purpose other than making zoomers feel special/represented.
It's a massive performance and DX boost but ok
>much performant boost wow
How about you just write better code instead of using tools to cope with your shit logic
It's not the code that is slow, retard, it's the tooling and the runtime that are slow.
u never said anything about HTTP
>serve a ... response
What was your interpretation?
wouldn't something like this work?
const server = Bun.serve({
  port: 3000,
  fetch(req) {
    return Response.json({ your: 'mother' });
  },
});
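For anyone who wants to sanity-check it: assuming the snippet above is running on port 3000, hitting it from another script or a REPL returns the JSON body. fetch is built into both Bun and recent Node, so either works.

const res = await fetch('http://localhost:3000');
console.log(await res.json());   // { your: 'mother' }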
of course it does but evidently he can't fucking read so you shouldn't even have bothered to write this
What to do when Zig inevitably breaks compatibility? Rewrite it?
>inevitably
It won't.
If you're really gonna hate on Zig, don't be an idiot and make valid criticisms.
I really haven't liked Zig so far. We've had enough Zig hate threads that I won't bother repeating them.
not 1.0, not production ready, which Andrew himself will tell you. miscompiles all the time
>It won't
go write some async code with 0.11.0
can't? why not? it worked before. they didn't break compatibility after all
>interpreted garbage for my backend
The battle was already won by Golang
>reads several GB of XML, performs some computation, and stores millions of records in MongoDB
>takes over an hour to complete
Sounds like a (you) problem. What you're describing shouldn't take more than 5 minutes, worst case scenario.
>you're gonna see some benefits but nothing that would make a multi-hour operation take under an hour
Thanks, I guess I won't bother with it then.
>parsing 50gb of xml in javascript
this lol
>just use 40 times more memory than rust and node.js bro trust me it's fast
(checked)
Imagine being a RAMlet in 2776 AUC.
I was actually just subtly asking LULZ if that's a bad thing or not, because I want to use Bun but don't know if that much RAM is bad.