You may be looking for one of the following things

🔗 Search Engine
🔗 Encyclopedia (not mobile-friendly!)
🔗 Website Explorer (improved)
🔗 Similar Website Finder
🔗 Server Status

My name is Viktor. I’m a Swedish software engineer and hypertext enjoyer. Marginalia is a website I’ve built. It’s really almost a bunch of websites on a common theme. If you find yourself clicking a link and ending up on a page that looks completely different, that’s just how things are.

🌎 Marginalia Search on GitHub
🦤 @MarginaliaNu on Twitter
🦤 @marginalia@mastodon.social
📚 @ViktorLofgren on YouTube
✉ïļ kontakt@marginalia.nu on Email

Site Index

📁 Weblog/2024-05-16
💭 Problems/2024-04-03
📁 Release Notes/2024-01-24
📁 Recipes/2023-08-31
🔧 Server Status Log/2023-08-27
📁 Miscellaneous/2023-05-08
📁 Marginalia Search/2023-03-28
📁 Links/2022-09-15
🤖 Weird AI Crap/2022-08-01
📄 Uses/2024-02-01

Recent Updates

2024-05-16 Experiment in Java native calls in log
I’ve experimentally replaced some of the Java implementations of quicksort and binary search with calls to C++ code, and saw huge benefits for the sorting code but the same or worse performance for binary search. The Marginalia Search engine is mainly written in Java, which is a language that is good at many things, but not particularly pleasant to work with when it comes to low-level systems programming. Unfortunately, a part of building an internet search engine involves database-adjacent low-level programming.
2024-05-05 Using DuckDB to seamlessly query a large parquet file over HTTP in log
A neat property of the parquet file format is that it’s designed with block I/O in mind, so that when you are interested in only parts of the contents of a file, it’s possible to some extent to only read that data. Many tools are aware of this property, and DuckDB is one of them. Depending on which circles you run in, a lesser known aspect of HTTP is range requests, where you specify which bytes of a file to retrieve.
2024-04-17 Query Parsing and Understanding in log
Been working on improving Marginalia Search query parsing and understanding. This is going to be a pretty long update, as it’s a few months’ work. Apart from cleaning up the somewhat messy query parsing code, a problem I’m trying to address is that the search engine is currently only good at dealing with fairly focused queries. They don’t need to be short, but if you try to qualify a search that is too broad by adding more terms, it often doesn’t produce anything useful.
2024-04-10 Deep Bug in log
The project has been haunted by a mysterious bug since sometime in February. It relates to the code that constructs the index, particularly the code that merges partial indices. In short, the search engine constructs the reverse index through successive merging of smaller indices, which reduces the overall memory requirement. You can conceptualize the reverse index itself as two files, one with offset pointers into another file, which has sorted numbers. This code runs after each partition finishes crawling and processing its data, and has a run time of about 4 hours.
2024-04-03 2 Sigma Education in problems
It is a pretty well-known fact that if you give kids one-on-one tutoring from experts, they outperform classroom-educated students by an absurd two standard deviations. The effect size is absolutely mind-boggling, and a damning brand of indictment against the entire modern education system. The problem is that this stuff clearly doesn’t scale, for two reasons: first of all, there aren’t enough experts to go around, as it would require a large chunk of the adult population to be experts dedicated to tutoring; and second of all, even if there were, having a dozen or more teachers per student would trivially be something like 300x more expensive than having 20-30 students per teacher.
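
For readers curious about the two-file picture of the reverse index mentioned in the Deep Bug entry above, here is a minimal sketch in Python. It is not the actual Marginalia code (which is Java and works on files on disk); the in-memory lists, the string term keys, and the names `build_index`, `lookup` and `merge_indices` are all assumptions made up for illustration.

```python
from heapq import merge

# Hypothetical model: an index is a pair (offsets, postings).
#   postings: one flat list of document ids, sorted within each term's run
#   offsets:  term -> (start, length) pointing into postings
# In a real engine both would be files; lists stand in for them here.

def build_index(term_to_docs):
    """Flatten {term: doc ids} into the (offsets, postings) pair."""
    offsets, postings = {}, []
    for term in sorted(term_to_docs):
        docs = sorted(set(term_to_docs[term]))
        offsets[term] = (len(postings), len(docs))
        postings.extend(docs)
    return offsets, postings

def lookup(index, term):
    """Return the sorted document ids for a term, or [] if absent."""
    offsets, postings = index
    start, length = offsets.get(term, (0, 0))
    return postings[start:start + length]

def merge_indices(a, b):
    """Merge two partial indices into one, de-duplicating document ids."""
    offsets, postings = {}, []
    for term in sorted(set(a[0]) | set(b[0])):
        merged = []
        # heapq.merge walks both sorted runs without re-sorting them
        for doc in merge(lookup(a, term), lookup(b, term)):
            if not merged or merged[-1] != doc:
                merged.append(doc)
        offsets[term] = (len(postings), len(merged))
        postings.extend(merged)
    return offsets, postings
```

Merging two small indices this way only ever needs the runs for one term at a time, which hints at why building the full index through successive merges keeps the memory requirement down.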


🏷ïļ ai/3
🏷ïļ bots/4
🏷ïļ cooking/6
🏷ïļ memex/2
🏷ïļ moral-philosophy/7
🏷ïļ nlnet/13
🏷ïļ platforms/9
🏷ïļ programming/19
🏷ïļ satire/4
🏷ïļ search-engine/61
🏷ïļ server/2
🏷ïļ web-design/12