
Samples (torrent?)

victor
.
permalink   Wed, Apr 30, 2008 @ 1:00 PM
This is just something I’m thinking about and I know there is a lot of religion wrapped up in this so please be gentle and try to be objective in giving feedback:

I want to download ALL samples on the site and organize them in a useful way (?)

I’m writing a script that will download them and sort them by license/bpm

Directory structure:

ccMixterSamples
|- attribution
||- 000-060
||- 060-065
||- 065-070
||-etc.
|- noncommercial
||- 000-060
||- 060-065
||- 065-070
||-etc.
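
A rough sketch of how a script might map a sample onto that layout. The function names are made up for illustration, and two things are assumptions: everything under 60 bpm lands in 000-060, and a tempo sitting exactly on a boundary (say 65) goes into the higher folder.

```python
from pathlib import PurePosixPath


def bpm_bucket(bpm: float, floor: int = 60, step: int = 5) -> str:
    """Return a folder name such as '000-060' or '090-095' for a tempo."""
    if bpm < floor:
        # Assumption: one catch-all folder for everything slower than `floor`.
        return f"000-{floor:03d}"
    # Assumption: the lower edge is inclusive, so 65.0 goes to '065-070'.
    low = floor + int((bpm - floor) // step) * step
    return f"{low:03d}-{low + step:03d}"


def sample_dir(license_name: str, bpm: float) -> PurePosixPath:
    """Build the target directory: ccMixterSamples/<license>/<bpm range>."""
    return PurePosixPath("ccMixterSamples") / license_name / bpm_bucket(bpm)


print(sample_dir("attribution", 67))
print(sample_dir("noncommercial", 45))
```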

I would put a little ‘meta’ text file for each file which contains info about the file like tags, exact bpm, maybe the description? I can also easily incorporate the ‘nicname’ in the file name so instead of:

Pitx_-_In_July_1.mp3

it would be

Pitx_-_In_July_1_Rhythm_Guitar.mp3
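
Something like this could do the renaming. It is only a sketch: `with_nicname` is a hypothetical helper, and replacing anything that is not a letter or digit with an underscore is my assumption about what counts as a safe file name.

```python
import re
from pathlib import PurePosixPath


def with_nicname(filename: str, nicname: str) -> str:
    """Append a cleaned-up 'nicname' to the file name, before the extension."""
    path = PurePosixPath(filename)
    # Assumption: collapse every run of non-alphanumeric characters to '_'.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", nicname).strip("_")
    if not slug:
        # Nothing usable in the nicname, so leave the name untouched.
        return filename
    return f"{path.stem}_{slug}{path.suffix}"


print(with_nicname("Pitx_-_In_July_1.mp3", "Rhythm Guitar"))
```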

Does this make sense? Maybe break down the thing further by (popular) instrument:
|- attribution
||- 000-060
|||- drums
|||- guitar
|||- keys
|||- horns
|||- multiple
|||- misc
||- 060-065
|||- drums
|||- guitar
etc…
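
A minimal sketch of how the instrument folder might be picked from an upload's tags. It assumes the tags literally contain the folder names, which real records probably won't do consistently; a synonym table (e.g. piano to keys) would be needed on top of this.

```python
from typing import Iterable

# Folder names taken from the proposed tree; 'multiple' and 'misc' are fallbacks.
INSTRUMENT_FOLDERS = ("drums", "guitar", "keys", "horns")


def instrument_folder(tags: Iterable[str]) -> str:
    """Pick one instrument folder for a sample based on its tags."""
    found = {tag.strip().lower() for tag in tags} & set(INSTRUMENT_FOLDERS)
    if len(found) > 1:
        return "multiple"  # more than one known instrument tagged
    if len(found) == 1:
        return next(iter(found))
    return "misc"  # no recognizable instrument tag


print(instrument_folder(["Guitar", "loop"]))
print(instrument_folder(["drums", "guitar"]))
print(instrument_folder(["field_recording"]))
```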

The alternative is to just pound the file names with bpm-instrument but I’m a little worried that the records aren’t consistent enough

090_095_guitar_Pitx_-_In_July_1_Rhythm_Guitar.mp3

Then possibly rex-ify/acidize a lot of it - the good ones - and then (?) maybe make a torrent out of it.

thoughts?
shimoda
.
permalink   Wed, Apr 30, 2008 @ 6:33 PM
I think this is an excellent thought, though the undertaking past the script for the original download would seem quite the daunting task. Perhaps I’m just new with Recycle, but it does seem like it would take time. I can say, however, how useful I think this would be in really building up mega starter packs of cc material. In my few months here, I’ve collected quite a few remixes and samples, but know I haven’t scratched the surface yet. I just discovered sleepless’s great samples through happenstance, but that kinda thing has taken time as well. A torrent would be great as it could break larger groups up.

Something to consider along with this project, however, is that if the work will be done to make any part of the samples rex’d/acidized, a base description of sounds might be a good additional bit of info. Whatever you decide, I’m good at doing this and that work and would offer any help necessary to achieve this task in the moments here and there between work life, remixing and family time.

Oh yeah, did I mention that I thought it was a good idea?
 
.
permalink   victor Wed, Apr 30, 2008 @ 8:08 PM
very astute, yes I estimate the scripts and dl’ing is about 2-3 days - everything else about 2-3 months (seriously) - but daunting is something we do here, that’s ok - as long as you think it’s a good idea, which I’ll put you down as a ‘yes’ ;)

I think there are several different packages we’re talking about here, one for the ‘raw’ samples, the ‘as is’ download in the structure proposed above, and then sample packs/spin-offs which is where all the work really is.

something like that…
duckett
.
permalink   Wed, Apr 30, 2008 @ 7:49 PM
Sounds ambitious, but I’m certainly not going to give you a hard time for making an effort to improve how samples are handled on the site… I guess it’s one of those things where I’d have to reserve judgement until it was actually implemented- I haven’t found myself experiencing any real headaches with how it is now, but who knows? Maybe what you’re exploring would make the old system seem clunky and awkward, comparatively… Do what thou must, O Mighty Admin ;-)
 
.
permalink   victor Wed, Apr 30, 2008 @ 8:11 PM
now is not the time to reserve anything, this is a request for feedback so folks find it useful and productive.

fwiw, I’m not talking about replacing the current browser, this would be in addition, a mega download with everything organized.
 
.
permalink   duckett Wed, Apr 30, 2008 @ 8:16 PM
Oh. In that case, well, uh… YEAH mane, go ‘head- do it to it!
DJ Rkod
.
permalink   Thu, May 1, 2008 @ 1:53 AM
I think this is a good idea. You know, it’s so nice to have an admin who’s actually here and really cares about the community.

I’d like to put in a request that you organize the samples by file format as well.
John Pazdan
.
permalink   Thu, May 1, 2008 @ 8:03 AM
The rex-ifying..the rex-ifying..one by one..months of work…looking at reCycle..blurry vision from too much purple and blue..begin to hate all non-normalized samples…shitty editing..cut off tails…arghh..must …not…volunt..

then again, why not.
spinmeister
.
permalink   Thu, May 1, 2008 @ 9:29 AM
I’m not sure if this belongs in this discussion, but a couple of structural artifacts make it harder to find some stuff here. And I’m as guilty as anyone of causing that:

* multiple samples in archive files
* the feature to upload secondary files

These are nice for several reasons:
* it’s easier to upload one zip file than 20 individual files
* using secondary files allows grouping of files that belong together in one way or another, and also reduces the clutter in the “uploads” tab.

However, those samples become difficult to impossible to find and/or preview.

So here’s why I think that issue may possibly be related to your proposal:
- as a side question, would your new feature include the samples inside zip archives and the secondary files?

But more importantly, would it be maybe worthwhile addressing this structural issue to make it a bit easier to find ALL ccMixter content before attacking a massive download feature?

If I had my ccMixter dreams come true, there would be a few features discussed and possibly implemented if they seem to make sense:

* split the “Uploads” tab for each artist into the same high level sections as the site has in its tabs: “remixes”, “Samples”, “A Capellas”
— I realize, that one can probably do this with the new (and very cool) tag search feature in the artist’s Uploads tab.
— But I can’t for the life of me figure out, why the high level mental/data model of the site “remixes, samples, a cappellas, playlists” would not apply at an artist level as well. It would seem much easier for a user to get to know one mental/data/structural model rather than several.
— This will not only make searching/finding stuff easier, but it might also reduce the temptation to upload a sample file representing one or more tracks from a remix as a secondary file.
— It might even encourage more remixers to upload key tracks (e.g. nicely played instrumental parts) from their remixes as individual samples.

* after a zip file is uploaded, extract it from the archive and actually present it as separate uploads.

* (related to the previous one) add the concept of a “song” to the mental/data model of the meta data (optional to populate, because some samples are not related to a song, some are). I’m not sure if the generic tagging infrastructure would be good enough to squeeze that in (or if it should have a separate column in the meta data table for the uploads) — I haven’t thought about it long enough.

— An uploaded and subsequently split zip archive would all get the same “song” data (if filled in) assigned to it, along with the same bpm etc.

* more mental/data model thoughts:
— samples can be musical performances or sounds; since a cappellas are already split out as a separate thing (rightfully so).
— many remixers cherish good instrumental performances almost as much as good vocal ones.
— key signatures as a meta data item (I know it’s problematic in some ways, but so are BPM, genre, mood)
— midi files as a specific subset of instrumental performance. With much of the DAW software having rather competent midi capabilities, midi files should be encouraged. It allows me to take a piano performance and have it played back on a Rhodes or Wurli, a harp or more exotic stuff. Midi performances can be re-bpm’d even easier than audio ones. Midi performances can be transposed even easier than audio ones.
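
To make the zip-splitting idea a bit more concrete, here is a minimal sketch using Python's standard `zipfile` module. It is not how the site actually works: `SampleEntry`, `split_archive` and the list of extensions are all hypothetical. Each audio file inside the archive becomes its own entry, and every entry inherits the same optional “song” and bpm.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import List, Optional

# Assumption: which extensions count as samples worth listing separately.
AUDIO_EXTENSIONS = (".mp3", ".wav", ".flac", ".ogg", ".aif", ".aiff", ".mid")


@dataclass
class SampleEntry:
    filename: str
    song: Optional[str]   # optional: not every sample belongs to a song
    bpm: Optional[float]


def split_archive(zip_bytes: bytes, song: Optional[str] = None,
                  bpm: Optional[float] = None) -> List[SampleEntry]:
    """List every audio file in an uploaded zip as its own entry."""
    entries = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if info.filename.lower().endswith(AUDIO_EXTENSIONS):
                # All members share the song/bpm given for the upload.
                entries.append(SampleEntry(info.filename, song, bpm))
    return entries
```

Non-audio members (readme files and the like) are simply skipped here; whether they should be kept alongside the samples is a separate question.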


Ok - I’m really sorry, VS - this entire post may seem horribly off-topic to your question. However, it is honestly the first thing that came to my mind when reading your post.

I think now that ccMixter has this very funky GUI, it may be worthwhile to refine/deepen the mental/data model (mental model for people, data model for the software). To be quite honest I think there is tons of rich musical data here, which isn’t being found easily enough. So the finding of stuff strikes me as a bigger issue than the downloading of stuff. Search arguably is the holy grail for any sample library. And that quickly leads to the need for a good underlying mental/data model.

Maybe a separate thread about the mental/data model of the musical data at ccMixter might be interesting?

Sorry again for the hijack of this thread, but I do think it is related.
 
.
permalink   victor Thu, May 1, 2008 @ 10:06 AM
Quote: But more importantly, would it be maybe worthwhile addressing this structural issue to make it a bit easier to find ALL ccMixter content before attacking a massive download feature?

we can do both (if we have to).

but yea, feel free to start another thread about the site’s mental model (aka ‘user model’ - mapping what the user is thinking into an actionable interface). I have plenty of thoughts, ideas, rationales and regrets to keep that thread alive for a while.

VS
 
.
permalink   spinmeister Thu, May 1, 2008 @ 11:42 AM
both? that would be a dream!

— I’ll start a separate thread

— and I’ll post a separate (on topic, I promise!) response to this thread
spinmeister
.
permalink   Thu, May 1, 2008 @ 12:30 PM
ok - on topic this time :-)

* first of all, I think the idea has considerable merit. And using torrents for huge files would seem to make sense as a bandwidth saving measure. At least for those ISPs who still have a shred of net neutrality decency.

* it would be really nice to include the files that are inside zip archives as well as the uploads that are secondary files to a main upload.

* a separate file containing the meta-data would strike me as much more flexible than stuff buried in file names. — Can you make it tab delimited, so it can be imported into one’s favorite spreadsheet program and further sorted and manipulated? One could probably even use some of the online ones, if one doesn’t want to clutter one’s music machine with a spreadsheet program.

— just an idea: what about that meta data file also containing links to the original file and making it available separately?

* it might also be worthwhile to auto-fill as much as possible of the standard mp3 id3 tags, because they do show up in many of the modern file browsers and media management software.

* creating REX2 or acid files sounds like a ton of work, because creating a sensible set of slices does require human musical judgement, no? - I have Recycle 2.1, so maybe I should give that a shot for a few loops/samples.
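
For the tab-delimited meta-data idea, a small sketch with Python's standard `csv` module. The column names, the bpm value and the example.com link are placeholders I made up, not the site's actual schema; the point is just one row per sample, including a link back to the original file.

```python
import csv
import io

# Hypothetical column set; 'source_url' is the link back to the original file.
FIELDS = ["filename", "license", "bpm", "tags", "source_url"]


def write_metadata(rows, out) -> None:
    """Write one tab-delimited line per sample, with a header row."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        record = dict(row)
        # Tags go into a single cell, comma-separated.
        record["tags"] = ",".join(record.get("tags", []))
        writer.writerow(record)


buffer = io.StringIO()
write_metadata(
    [{
        "filename": "Pitx_-_In_July_1_Rhythm_Guitar.mp3",
        "license": "attribution",
        "bpm": 92,  # made-up value for the example
        "tags": ["guitar", "loop"],
        "source_url": "http://example.com/Pitx_-_In_July_1.mp3",  # placeholder
    }],
    buffer,
)
print(buffer.getvalue())
```

A file like this opens directly in any spreadsheet program, so sorting by bpm or filtering by license needs no extra tooling.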
 
.
permalink   radiotimes Tue, May 6, 2008 @ 9:24 PM
well I take my hat off to you Vic for this idea. As Spin says it seems like a bucket of work but if you can pull it off I think it will greatly enhance the site.

Best of luck!
victor
.
permalink   Wed, May 7, 2008 @ 6:35 PM
So here’s some more data…

I’ve REXified the entire Trifonic Emergence sample pack, ended up with 167 REX files - it took a week, a few hours a day.

There are currently 2,100 uploads here at ccM marked sample, 550 of those with ZIP archives.

Figure (conservatively) 3 samples per ZIP that’s about 3,600 samples.

3,600 samples divided by 167/week = about 21.6 man-weeks of sample processing. (If you think 3 samples/ZIP is too low, add 3.3 weeks for every additional sample/ZIP)
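
For anyone who wants to play with the assumptions, the back-of-envelope estimate boils down to two divisions. This sketch uses the figures quoted in this post (about 3,600 samples, 550 ZIP uploads, 167 REX files per week); the function names are made up, and the total comes out in the low twenties of weeks.

```python
def weeks_needed(total_samples: int, per_week: int) -> float:
    """Weeks of processing at a steady weekly rate."""
    return total_samples / per_week


def extra_weeks_per_sample_per_zip(zip_uploads: int, per_week: int) -> float:
    """Added weeks for each extra sample assumed inside every ZIP."""
    return zip_uploads / per_week


REX_PER_WEEK = 167     # rate from the Trifonic pack
TOTAL_SAMPLES = 3600   # rough total quoted above
ZIP_UPLOADS = 550      # uploads that carry a ZIP archive

print(round(weeks_needed(TOTAL_SAMPLES, REX_PER_WEEK), 1))
print(round(extra_weeks_per_sample_per_zip(ZIP_UPLOADS, REX_PER_WEEK), 1))
```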

Note that I was moving as fast as I could and as a result I guarantee many of the REXs could have been cut better.

Even taking into account that many of the samples here don’t deserve to be touched, let alone cut and processed, because of content but also because of low quality, and that many of the samples here are already loop-cut and would need no or only a tiny amount of processing - I think a half-a-year project has moved beyond “daunting”

I’m going on the road next week until June. What I propose (not promise) for myself is to set aside 2 months to do this and then quit, wherever I am - and see if we have anything worthwhile at that point.

(I’ll be talking with Trifonic about the best way to make the REXs (570MB) available in case anyone is interested)