Wiki Devblog - April 29 through May 12

bug fixes

We'd earlier coded a custom appearance for <video> and <audio> elements embedded into content (like commentary). Like images, these support a handy align="center" attribute, but apparently we never tested this - it wasn't even recognized in the content parsing code. It came up during data review and we got that attribute working like it's meant to!

During the implementation of content entries as proper data objects, we renamed a property artistDisplayText to just artistText. This property lets you show custom text different from the list of artists actually credited for a commentary entry:

Custom artist text. The artists for this commentary entry are written as Lilithtreasure, Witch's Cadence, then followed by a pipe and just one wiki link, pointing to Witch's Cadence. On the website it displays only that link.

We'd broken it because we'd only begun to rename that property across the codebase - all code thought that artist entries simply didn't have custom display text at all! We fixed this.

We fixed an error we'd made setting up multiple artworks! There were a lot of properties which are more or less coded in the same way - exposing their own value or else inheriting from the artwork, when configured so... and we had accidentally instructed dimensions to be null when an artwork isn't configured to inherit cover artists, rather than dimensions. Since flashes don't have cover artists at all, they were completely ignoring any non-square dimensions. Oops!
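For the curious, the mix-up can be modeled in a few lines. This is a toy sketch, not the wiki's actual code: the function inheritedProperty and the flags inheritDimensions and inheritCoverArtists are hypothetical names, and it only models the inheriting half of the behavior (not an artwork exposing its own value).

```javascript
// Toy model (all names hypothetical): an artwork takes a property from the
// thing it belongs to, but only if the matching "inherit" flag is set.
function inheritedProperty(artwork, property, flagName) {
  if (!artwork[flagName]) return null;
  return artwork.thing[property] ?? null;
}

const flashArtwork = {
  thing: {dimensions: [650, 450]}, // a non-square flash
  inheritDimensions: true,
  inheritCoverArtists: false,      // flashes have no cover artists
};

// Buggy wiring: dimensions gated on the cover-artists flag.
const buggyDimensions =
  inheritedProperty(flashArtwork, 'dimensions', 'inheritCoverArtists');

// Fixed wiring: dimensions gated on the dimensions flag.
const fixedDimensions =
  inheritedProperty(flashArtwork, 'dimensions', 'inheritDimensions');

console.log(buggyDimensions); // null
console.log(fixedDimensions); // [ 650, 450 ]
```

With the wrong flag, anything without cover artists silently loses its dimensions, which is the behavior flashes were showing.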

lyrics details

We recently added support for switching between multiple lyrics entries for tracks. We've touched this up, inside and out!

Pick your favorite!

Here's the YAML for these two:

Lyrics: |-
    <i>Svix:</i> ([artist's lyrics](https://www.furaffinity.net/view/3453787/))

    Take my hand I'll help you stand
    and then we'll journey through all the things we've planned
    I'll be sure to help you understand
    That I'm going nowhere I'm here for the future

    ...


    <i>[[artist:lilithtreasure|Lilith]], [[artist:quasar-nebula|Lan]]:</i> (wiki lyrics)

    Take my hand I'll help you stand
    And then we'll journey through all the things we've planned
    I'll be sure to help you understand
    That I'm going nowhere I'm here for the future

    ...

And here's how all that's getting used:

  • The <i>Svix:</i> part really does point to the artist Svix, just like for commentary entries, but here in lyrics, it isn't doing anything.
  • The second entry is annotated (wiki lyrics), though, so it does show its artist text - those Lilith and Lan links - in a dandy new "Contributions from" line.
  • The text in the annotation controls the name of the switcher. So like, the first entry extracts artist's lyrics out of the Markdown link [artist's lyrics](...), and the second entry is wiki lyrics, since its annotation is just written that way anyway.
  • The URL in the annotation tells the wiki the source for the lyrics. It's what ends up in that "Via" line, where just the platform name is displayed. (FurAffinity hasn't been added as a recognized platform though... so the wiki just shows furaffinity.net...)
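The annotation handling described in the list above can be sketched roughly like so. This is an illustration under assumptions, not the wiki's real code: describeAnnotation and the knownPlatforms table are made-up names, and the table's contents are placeholders.

```javascript
// Assumed lookup of recognized platforms (placeholder contents).
// FurAffinity is deliberately absent, as in the post.
const knownPlatforms = {
  'soundcloud.com': 'SoundCloud',
  'youtube.com': 'YouTube',
};

// Hypothetical helper: read a lyrics-entry annotation, which is either a
// Markdown link like [artist's lyrics](https://...) or plain text like
// "wiki lyrics".
function describeAnnotation(annotation) {
  const match = annotation.match(/^\[(.+?)\]\((.+?)\)$/);

  // Plain annotation: the text itself names the switcher, no source.
  if (!match) {
    return {switcherName: annotation, sourceURL: null, via: null};
  }

  // Linked annotation: the link text names the switcher, and the URL's
  // domain decides what the "Via" line shows.
  const [, label, sourceURL] = match;
  const hostname = new URL(sourceURL).hostname.replace(/^www\./, '');
  const via = knownPlatforms[hostname] ?? hostname;
  return {switcherName: label, sourceURL, via};
}

const sourced = describeAnnotation(
  "[artist's lyrics](https://www.furaffinity.net/view/3453787/)");
const wikiWritten = describeAnnotation('wiki lyrics');

console.log(sourced.switcherName, '/', sourced.via);
console.log(wikiWritten.switcherName, '/', wikiWritten.via);
```

An unrecognized domain falls back to the bare hostname, which matches the furaffinity.net behavior mentioned above.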

And some asterisks:

  • The list of contributors is only displayed for "wiki editor" lyrics, yes really. We sort of imagine that the stuff that belongs inside <i>...:</i>, for sourced lyrics, is whoever would have been responsible for putting the lyrics on that platform... like, the entity who "represents" those lyrics. Maybe the music label, or the band/project as a whole, or album asset management, or... but right now that's really subjective and up in the air, so we don't display the artists for sourced lyric entries at all, yet.
  • Right now the artist text (the stuff between <i>...:</i>) is just displayed as it is, but it should usually be written as entries separated by commas anyway. We may parse and present it in a more list-y format later on, since that is now a thing we can do.
  • We're implicitly parsing out the artist references from those [[artist:lilithtreasure|Lilith]] links, which means these lyrics entries really do point towards the artists. And this works for commentary entries too, so life is occasionally newly good.

Diff showing a commentary entry which doesn't need to explicitly write out its Cool and New Music Team reference now, since the link is part of the displayed text anyway.

@todo find better example before publishing devblog

Also, if a lyrics entry annotated "wiki lyrics" includes any annotative parts enclosed in square brackets, it now gets a neat new "Mind parts marked in [square brackets]" line at the top:

That new line, in context! The lyrics are mostly complete and certain, but one line has some square brackets, and reads: 'Maybe it's all [not?] a dream'

This track is You Fade Away, by Svix.
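One way such a check could work, as a hedged sketch: the helper names hasBracketedParts and shouldShowBracketNotice, and the entry shape with annotation and body fields, are assumptions for illustration. Wiki links and Markdown links also use square brackets, so the sketch strips those before looking.

```javascript
// Hypothetical check: does a lyrics body contain annotative [bracketed]
// parts, ignoring brackets that belong to links?
function hasBracketedParts(lyricsBody) {
  const withoutLinks = lyricsBody
    .replace(/\[\[.*?\]\]/g, '')          // wiki links like [[artist:...|Name]]
    .replace(/\[[^\]]*\]\([^)]*\)/g, ''); // Markdown links like [text](url)
  return /\[[^\]]+\]/.test(withoutLinks);
}

// Only entries annotated "wiki lyrics" get the notice.
function shouldShowBracketNotice(entry) {
  return entry.annotation === 'wiki lyrics' && hasBracketedParts(entry.body);
}

console.log(shouldShowBracketNotice({
  annotation: 'wiki lyrics',
  body: "Maybe it's all [not?] a dream",
})); // true
```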

soundcloud scraping

We received word that SoundCloud uses user content to train AI, which is a pretty harsh tl;dr for a handful of reasons, but is also the tl;dr which ten thousand or so people are probably acknowledging and distributing at the moment. (See these brand new articles which note the relevant clause in the ToS dates to February 2024, and include a statement from SoundCloud; the articles' recency should indicate that the belated news is gaining some traction, though.)

We were immediately alarmed that a lot of material might start disappearing from SoundCloud quickly. And while we've never assessed how well or poorly SoundCloud pages are currently archived on the Wayback Machine, we knew it couldn't be thorough enough, and that, besides, most of the music itself had probably never been mirrored or downloaded at all. (After all, the Wayback Machine generally captures the static content of webpages, not material that streams in only following some user interaction.)

So we laid out a bunch of plans to try to help with both big aspects - making verifiable captures on the Wayback Machine of lots of SoundCloud pages, and downloading tracks too, so that they can be privately or publicly preserved. And then we got started!

The simple outline of our plan is as follows:

  1. Write a program which watches as you, a human being, scroll SoundCloud pages with lots of lazy-loaded elements, and automatically extracts the stuff you can see with your own eyes or fingers - most importantly, URLs.
  2. Decide what scope or scopes we want to consider "interesting" for the music wiki's purposes, because we can't well feed millions of random tracks to the Wayback Machine or our own storage space (let alone gather those URLs in the first place, and risk missing the ones that do matter to us!).
  3. Use the program to help gather all those URLs, e.g. by scrolling user profile pages top to bottom. This is either manual, or automated (like with a macro) if there are just way too many to handle by hand.
  4. Feed all the URLs to the Wayback Machine (using their Google Sheets integration service) to create verifiable captures. Meanwhile, download the same pages locally, so we have them available for quick offline processing.
  5. Through that offline processing, identify which tracks have "real" downloads available today, and for which we'll have to just preserve the audio that gets streamed to your browser during online playback. (The former is preferred because they're the original files the artist uploaded; streamed audio is typically compressed and lossy, and lacks the artist's own audio file metadata.)
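To make steps 2 and 3 a little more concrete, here's a rough sketch of filtering gathered links down to in-scope track URLs. Nothing here is the actual program: collectTrackURLs, the usersInScope set, and the example paths are all hypothetical, and the list of non-track path segments is an assumption about SoundCloud's URL layout.

```javascript
// Assumed profile sub-pages that are not individual tracks.
const nonTrackSegments = new Set([
  'sets', 'tracks', 'albums', 'likes', 'reposts',
  'followers', 'following', 'popular-tracks',
]);

// Hypothetical filter: keep only /user/track links for users in scope,
// normalized (no query string) and de-duplicated.
function collectTrackURLs(hrefs, usersInScope) {
  const seen = new Set();
  for (const href of hrefs) {
    let url;
    try {
      url = new URL(href, 'https://soundcloud.com');
    } catch {
      continue; // not a usable URL
    }
    if (url.hostname !== 'soundcloud.com') continue;

    const [user, track, ...rest] = url.pathname.split('/').filter(Boolean);
    if (!user || !track || rest.length > 0) continue;
    if (nonTrackSegments.has(track)) continue;
    if (!usersInScope.has(user)) continue;

    seen.add(`https://soundcloud.com/${user}/${track}`);
  }
  return [...seen];
}

const gathered = collectTrackURLs(
  [
    '/example-artist/first-song',
    'https://soundcloud.com/example-artist/first-song?in=example-artist/sets/demo',
    '/example-artist/sets/demo',
    '/example-artist/likes',
    '/someone-else/other-song',
    'https://example.com/example-artist/first-song',
  ],
  new Set(['example-artist']));

console.log(gathered);
```

The resulting list is the kind of thing that could then be handed to the Wayback Machine in step 4.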

Since we prepared this plan and project over just this latest weekend (this post is late!), we're basically only past step 1 above. However, short of just "going for it" with step 2, we know what we're doing for the rest! See #album-repository for the latest detailed summary, and please track that channel if you're interested in following how this comes along in real-time.

'Some Artifacts May Occur'

We reviewed this album addition by Lilith.

This album was tricky because practically nothing is known about it. All information scattered about the web apparently sources to the album's inclusion in IAreKyleW00t's complete collection of fanmusic from 2013, but little to nothing is known about that inclusion.

For wiki editor notes on this album, we summarized what info we did find - information about the two existing downloads, the list of names the artist apparently goes by (there are a lot), as well as the sources we reviewed for track names.

We also traced and dated the album description which currently lives, slightly revised, at homestuck.net - it was originally a wiki edit by DarkMarxSoul from 2017. The revision is a bit funny, because while the latest version gives a definitive date of June 23, 2013, the original text is a lot closer to what we were able to find by our own research.

'Cognitive Coalescence'

We reviewed this album addition (a single) by Lilith.

We reviewed the initial art tags here - John and June both, together. But the artwork features only one character.

This was consistent with existing uses of the June tag - before it was introduced in January 2023 (so actually in March 2022 - #artwork-tagging if you want to reread), we decided that the places where June was used also should be accessible by, well, browsing uses of the John tag.

That early conversation was in the context of a new idea of "sub-tags", and pretty directly led into always tagging CaNWC characters - like Jhon - alongside their "main" character tag - John, again, here.

This all worked, but it was pretty clunky. Around the introduction of networked tags (which happened ages before the feature was released, haha) we ended up removing those CaNWC double-uses; after all, with networked tags, all the John-related tags would be... networked together, somehow!

This did come to pass (eventually), and we now have John (archetype) collecting all such Johns-and-co. But we overlooked June and hadn't also un-doubled uses of that tag, before. We have now!

Apart from tag review, we checked over the reference list for this track, received confirmation of Moonsetter's presence, and tidied some commentary and comments-over-Discord into neater crediting sources.

DELTARUNE newsletter clippings

We reviewed and worked with ruby on her pull request, back from January, adding "rejected" tracks from DELTARUNE Chapter 2.

This was kind of a project, ha. From the outset, these were added - to References Beyond Homestuck - because a handful of the tracks were referenced by tracks which did show up in the soundtrack for Chapter 2 - for example My Castle Town now references "Castle Town (Unfinished)" instead of The Legend. But we noticed that ruby had included all the tracks from the newsletter extra, even the ones which weren't later referenced. On top of that, all tracks had thorough and nicely formatted commentary from Toby. We figured all this would work a lot better as a dedicated album, rather than scattered around RBH.

So after some discussion (which incidentally drove a new view for group galleries), we decided that, sure, these tracks do belong as an air-quotes "album" on the wiki, seeing as toby tumblr sure is here... but so do the rest of the knacks and scraps and miscellaneously released music for deltarune and co. We got started adding these and worked at what the series on the Toby Fox and UNDERTALE & DELTARUNE groups might look like, then left the rest of the related additions to ruby.

Apart from working out where these tracks even belonged, in review, we also formatted a bunch of commentary text to make use of the recently-added <audio> feature, instead of... a lot of email HTML.

# old
<b><center>Castle town unfinished</b></center>
<center>Castle Town (Unfinished)</center>
<center><audio class="music" controls="" volume="0.7"
<source src="https://toby.fangamer.com/assets/interviews/9th/music/castle_town_unfinished.mp3" type="audio/mp3"
<p>Your browser does not support music playback. Please <html:a href="https://toby.fangamer.com/assets/interviews/9th/music/castle_town_unfinished.mp3">download the file</a> to listen.</p></audio></center>

# new (thank god)
<audio align="center" src="media/misc/archive/power_of_neo_unfinished_wip.mp3"></audio>

This reformatting got us to fix a bug about center-aligned audio not aligning center.

more data stuff

We reviewed and merged ruby's revisions for Nintendo Music week 16, as well as some fixes for Sonic Mania's mid-boss track.

We added the SWEET RAVE PARTY² art collab! This mostly meant adding a bunch of artist information - checking names and adding lots (and lots) of social links. This also indicated to us that non-square flash artworks were broken. We fixed that.

We reviewed Lilith's pull request adding a bunch of releases from Svix, mostly smaller lyrical ones. This instigated all our feature work on lyrics details, and we reviewed, revised, and filled out those details across all the additions here, plus Svix's album Induction added last year.

We had a go rambling about multiple versions in the context of reference relationships, but didn't quite get where we wanted with it. (#references-roundup)

content parsing

Last time we figured out how to internally code content entries as proper wiki data objects. We've since expanded on that work in a handful of fun ways!

Fundamentally, all "content parsing" happens with a very simple function called parseContentNodes (previously parseInput). This function does more or less "nothing special" and can be used anywhere on the backend - its only role is to transform a string, representing a complete stretch of content text, into a roughly flat structure, representing the stuff going on in that string.
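To give a feel for what "a roughly flat structure" means, here's a toy stand-in. It is not the real parseContentNodes: the name parseNodesSketch and the node shape ({type, data}) are assumptions for illustration, and it only understands [[key:ref|label]] tags.

```javascript
// Toy stand-in (not the real parser): split content text into a flat list
// of text nodes and [[key:ref|label]] tag nodes.
function parseNodesSketch(input) {
  const nodes = [];
  const tagPattern = /\[\[([a-z-]+):([^|\]]+)(?:\|([^\]]+))?\]\]/g;
  let lastIndex = 0;

  for (const match of input.matchAll(tagPattern)) {
    // Any text between the previous tag and this one becomes a text node.
    if (match.index > lastIndex) {
      nodes.push({type: 'text', data: input.slice(lastIndex, match.index)});
    }
    const [whole, key, ref, label] = match;
    nodes.push({type: 'tag', data: {key, ref, label: label ?? null}});
    lastIndex = match.index + whole.length;
  }

  // Trailing text after the last tag.
  if (lastIndex < input.length) {
    nodes.push({type: 'text', data: input.slice(lastIndex)});
  }
  return nodes;
}

const headingNodes = parseNodesSketch(
  '[[artist:lilithtreasure|Lilith]], [[artist:quasar-nebula|Lan]]:');

console.log(headingNodes.map(node => node.type).join(' '));
// tag text tag text
```

Because the output is just a flat list, any caller can walk it without knowing about HTML or anything else downstream.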

Because it's so portable, it's already gotten some use around multiple parts of the codebase:

  • transformContent (content code), which is the big interface between content nodes and actual HTML, and is used everywhere content text appears on the site (it's directly part of 21 other content functions!)
  • reportContentTextErrors (data checks), which mainly detects errors in references written in links like [[artist:alex-roseti]], plus badly formatted URLs in Markdown links

OK well only two places previously, but they are two quite unrelated places. And parseContentNodes gets two new uses, now, too:

  • parseContentEntries (YAML transformations), for the headings of entries, decides if the text between <i>...:</i> is a list of direct artist references or just content text, and - if it includes both, split by a pipe | - where one starts and the other ends
  • contentArtists (data compositions) now conditionally extracts the references from tags like [[artist:quasar-nebula|Lanolin]], and uses that to expose the list of apparent artists on any(!) content entry

These are pretty cool uses on their own merits, but we want to draw attention to the latter one in particular, because like the rest of data compositions, any new stuff is sure to lead to more cool stuff, sooner or later. withContentNodes, like parseContentNodes itself, may be used in any context, and we've got a function-plus-composition splitContentNodesAround that can be used for all sorts of list-y shenanigans.

Particular uses are most certainly TBD, though. LOL
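As one guess at the list-y shenanigans in question, splitting a flat node list around a separator might look something like this. It's a sketch under assumptions, not the real splitContentNodesAround: splitNodesAround is a made-up name, and the {type, data} node shape is assumed. The point is that the separator only counts in plain text nodes, never inside a tag.

```javascript
// Hypothetical sketch: split a flat node list into groups wherever the
// separator appears in a text node. Tag nodes are never split.
function splitNodesAround(nodes, separator) {
  const groups = [[]];
  for (const node of nodes) {
    if (node.type !== 'text') {
      groups[groups.length - 1].push(node);
      continue;
    }
    node.data.split(separator).forEach((piece, index) => {
      if (index > 0) groups.push([]); // each separator starts a new group
      if (piece.length > 0) {
        groups[groups.length - 1].push({type: 'text', data: piece});
      }
    });
  }
  return groups;
}

const lilithTag =
  {type: 'tag', data: {key: 'artist', ref: 'lilithtreasure', label: 'Lilith'}};
const lanTag =
  {type: 'tag', data: {key: 'artist', ref: 'quasar-nebula', label: 'Lan'}};

// "[[artist:lilithtreasure|Lilith]], [[artist:quasar-nebula|Lan]]" as nodes:
const artistGroups =
  splitNodesAround([lilithTag, {type: 'text', data: ', '}, lanTag], ', ');

console.log(artistGroups.length); // 2
```

The pipes inside those tags' labels never show up as text nodes, so splitting around a pipe at the node level leaves links alone.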

stringification sorrows

In silent response to a request that particular whitespace be preserved in crediting sources, we decided to have a go at reworking the internals of HTML stringification. The aim was ostensibly simple - add a new html.metatag('tightspace') element, which indicates its content should always be aligned with column zero, in the resultant HTML. This way, we would style whatever element we like with white-space: pre-wrap and reflect original whitespace just by bringing it as-is into the page's source text.

Unfortunately... no dice! The work was a good time, but we got a bit lost in the weeds and lost our steam before we worked a working solution out of... the rework. We got some more familiarity with parsing problems as a whole (surprise surprise, still trying to learn in this area) - in particular we feel we were focusing too hard on making a consumable intermediate representation, before figuring out how we were going to consume it. If we'd worked out the problem from a different perspective we figure we might have been able to bypass the IR we ended up with altogether, so we decided to chuck what we had (after backing it up) and come back to this idea another time.

Meanwhile, there are probably other (frankly more robust *lol*) ways to get the sorts of whitespace preservation we're really looking for.

☕️

From the outset we were angling for a more data-oriented pair of weeks here, but in truth most of the work in this devblog comes from the first week (you know, up through May 5, approximately).

Nevertheless, during that week, we drank coffee. Your coffee. Thank you!

COFFEE COUNT: 3 coffees
YOU (yes you) TIPPED US: all three of them waow

The previous entry was mostly retrospective. We wrote this devblog as we went. It was fun! We're mostly happy with how interconnected everything here feels, which hopefully gives a sense for like... the actual texture of wiki work. Features happen because more data happens, and more data happens because features happen; brand new stuff builds on stuff young and old alike; and then oh shit SoundCloud contemplates catching on fire and it's archival time. Good times all around.

As we sure hope is usual (actually a primary benefit to periodic devblog posts LOL) - the preview website is updated and live, and includes more or less everything we've talked about here! (Except some of the data stuff... Not all of that is merged yet... Soon though...!)

https://preview.hsmusic.wiki
https://ko-fi.com/qzneb
slurp

~ QN