Petter Reinholdtsen

Is the short movie «Empty Socks» from 1927 in the public domain or not?
5th December 2017

Three years ago, a presumed lost animation film, Empty Socks from 1927, was discovered in the Norwegian National Library. At the time it was discovered, it was generally assumed to be copyrighted by The Walt Disney Company, and I blogged about my reasoning to conclude that it would enter the Norwegian equivalent of the public domain in 2053, based on my understanding of Norwegian Copyright Law. But a few days ago, I came across a blog post claiming the movie was already in the public domain, at least in the USA.

The reasoning is as follows: The film was released in November or December 1927 (sources disagree), and its copyright was presumably registered that year. At that time, right holders of movies registered with the copyright office received government protection for their work for 28 years. After 28 years, the copyright had to be renewed if they wanted the government to protect it further. The blog post I found claims such renewal did not happen for this movie, and thus it entered the public domain in 1956. Yet someone claims the copyright was renewed and the movie is still copyright protected.

Can anyone help me figure out which claim is correct? I have not been able to find Empty Socks in Catalog of copyright entries. Ser.3 pt.12-13 v.9-12 1955-1958 Motion Pictures available from the University of Pennsylvania, neither on page 45 for the first half of 1955, nor on page 119 for the second half of 1955. It is of course possible that the renewal entry was left out of the printed catalog by mistake. Is there some way to rule out this possibility? Please help, and update the Wikipedia page with your findings.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Tags: english, freeculture, opphavsrett, verkidetfri, video.
Metadata proposal for movies on the Internet Archive
28th November 2017

It would be easier to locate the movie you want to watch in the Internet Archive, if the metadata about each movie was more complete and accurate. In the archiving community, a well known saying states that good metadata is a love letter to the future. The metadata in the Internet Archive could use a face lift for the future to love us back. Here is a proposal for a small improvement that would make the metadata more useful today. I've been unable to find any document describing the various standard fields available when uploading videos to the archive, so this proposal is based on my best guess and on searching through several of the existing movies.

I have a few use cases in mind. First of all, I would like to be able to count the number of distinct movies in the Internet Archive, without duplicates. I would further like to identify the IMDB title ID of the movies in the Internet Archive, to be able to look up an IMDB title ID and know if I can fetch the video from there and share it with my friends.

Second, I would like the Butter data provider for The Internet Archive (available from github) to list as many of the good movies as possible. The plugin currently does a search in the archive with the following parameters:

AND NOT collection:movie_trailers
AND -mediatype:collection
AND format:"Archive BitTorrent"
AND year

Most of the cool movies that fail to show up in Butter do so because the 'year' field is missing. The 'year' field is populated from the year part of the 'date' field, which should state when the movie was released (date or year). Two such examples are Ben Hur from 1905 and Caminandes 2: Gran Dillama from 2013, where the year metadata field is missing.
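To make it easier to see which movies such a search would match, here is a minimal Python sketch that joins the clauses above into one query string and builds a search URL from it. Two things are assumptions on my part and not taken from the plugin: the leading clause `collection:moviesandfilms` (the list above starts with "AND NOT", so some first term must precede it), and the use of the `advancedsearch.php` endpoint with these parameters.

```python
from urllib.parse import urlencode

# ASSUMPTION: the first clause is not shown in the list above.
ASSUMED_FIRST_CLAUSE = "collection:moviesandfilms"

# The remaining clauses, as listed in the post.
CLAUSES = [
    "NOT collection:movie_trailers",
    "-mediatype:collection",
    'format:"Archive BitTorrent"',
    "year",
]


def build_query(first_clause=ASSUMED_FIRST_CLAUSE):
    """Join all clauses with AND into a single query string."""
    return " AND ".join([first_clause] + CLAUSES)


def build_search_url(query, rows=50):
    """Build a search URL (endpoint and parameters are assumptions)."""
    params = [
        ("q", query),
        ("fl[]", "identifier"),
        ("fl[]", "year"),
        ("rows", rows),
        ("output", "json"),
    ]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)
```

Dropping the final "year" clause from CLAUSES and comparing the two result sets would be one way to list the movies that are hidden only because of the missing field.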

So, my proposal is simply, for every movie in The Internet Archive where an IMDB title ID exists, please fill in these metadata fields (note, they can be updated also long after the video was uploaded, but as far as I can tell, only by the uploader):
Media type ('mediatype'): Should be 'movie' for movies.
Collection ('collection'): Should contain 'moviesandfilms'.
Title: The title of the movie, without the publication year.
Date ('date'): The date or year the movie was released. This makes the movie show up in Butter, makes it possible to know the age of the movie, and is useful to figure out copyright status.
Director: The director of the movie. This makes it easier to know if the correct movie is found in movie databases.
Production company: The production company making the movie. Also useful for identifying the correct movie.
Links: Add a link to the IMDB title page, for example like this: <a href="">Movie in IMDB</a>. This makes it easier to find duplicates and allows for counting the number of unique movies in the Archive. Other external references, like to TMDB, could be added like this too.

I did consider proposing a custom field for the IMDB title ID (for example 'imdb_title_url', 'imdb_code' or simply 'imdb'), but suspect it will be easier to simply place it in the links free text field.
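A rough Python sketch of how an uploader could check one of their items against this proposal. The field names in PROPOSED_FIELDS are my guesses at what the metadata keys are called ('links' in particular is a hypothetical key for the free text field), and the metadata is represented as a plain dict rather than fetched from the archive.

```python
import re

# ASSUMPTION: these key names are guesses; 'links' is hypothetical.
PROPOSED_FIELDS = ["mediatype", "collection", "title", "date", "links"]

# IMDB title pages have URLs of the form imdb.com/title/tt<digits>.
IMDB_TITLE_RE = re.compile(r"imdb\.com/title/(tt\d+)")


def missing_fields(metadata):
    """Return the proposed fields that are absent or empty."""
    return [field for field in PROPOSED_FIELDS if not metadata.get(field)]


def imdb_title_id(metadata):
    """Extract the first IMDB title ID from the links text, or None."""
    match = IMDB_TITLE_RE.search(metadata.get("links", ""))
    return match.group(1) if match else None


# Example item with no release date filled in.
item = {
    "mediatype": "movie",
    "collection": "moviesandfilms",
    "title": "The Fighting Lady",
    "links": '<a href="https://www.imdb.com/title/tt0036823/">Movie in IMDB</a>',
}
```

Here missing_fields(item) reports the missing 'date', which is exactly the kind of gap that keeps a movie out of Butter, while imdb_title_id(item) pulls the title ID out of the link text so duplicates can be spotted.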

I created a list of IMDB title IDs for several thousand movies in the Internet Archive, but I also got a list of several thousand movies without such an IMDB title ID (and quite a few duplicates). It would be great if this data set could be integrated into the Internet Archive metadata to be available for everyone in the future, but with the current policy of leaving metadata editing to the uploaders, it will take a while before this happens. If you have uploaded movies into the Internet Archive, you can help. Please consider following my proposal above for your movies, to ensure that each movie is properly counted. :)

The list is mostly generated using Wikidata, which, based on Wikipedia articles, makes it possible to link between IMDB and movies in the Internet Archive. But there are lots of movies without a Wikipedia article, and some movies where only a collection page exists (like for the Caminandes example above, where there are three movies but only one Wikidata entry).

Tags: english, opphavsrett, verkidetfri.
Legal to share more than 3000 movies listed on IMDB?
18th November 2017

A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and to try to count the number of movies listed in IMDB that are legal to distribute on the Internet. I have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in a git repository, currently available from github.

So far I have identified 3186 unique IMDB title IDs. To gain better understanding of the structure of the data set, I created a histogram of the year associated with each movie (typically release year). It is interesting to notice where the peaks and dips in the graph are located. I wonder why they are placed there. I suspect World War II caused the dip around 1940, but what caused the peak around 2010?

I've so far identified ten sources for IMDB title IDs for movies in the public domain or with a free license. These are the statistics reported when running 'make stats' in the git repository:

  249 entries (    6 unique) with and   288 without IMDB title ID in free-movies-archive-org-butter.json
 2301 entries (  540 unique) with and     0 without IMDB title ID in free-movies-archive-org-wikidata.json
  830 entries (   29 unique) with and     0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
 2109 entries (  377 unique) with and     0 without IMDB title ID in free-movies-imdb-pd.json
  291 entries (  122 unique) with and     0 without IMDB title ID in free-movies-letterboxd-pd.json
  144 entries (  135 unique) with and     0 without IMDB title ID in free-movies-manual.json
  350 entries (    1 unique) with and   801 without IMDB title ID in free-movies-publicdomainmovies.json
    4 entries (    0 unique) with and   124 without IMDB title ID in free-movies-publicdomainreview.json
  698 entries (  119 unique) with and   118 without IMDB title ID in free-movies-publicdomaintorrents.json
    8 entries (    8 unique) with and   196 without IMDB title ID in free-movies-vodo.json
 3186 unique IMDB title IDs in total

The entries without IMDB title ID are candidates to increase the data set, but might equally well be duplicates of entries already listed with IMDB title ID in one of the other sources, or represent movies that lack an IMDB title ID. I've seen examples of all these situations when peeking at the entries without IMDB title ID. Based on these data sources, the lower bound for movies listed in IMDB that are legal to distribute on the Internet is between 3186 and 4713: 3186 if none of the entries without IMDB title ID add a new movie, and 4713 if every one of them does.
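The two numbers can be reproduced from the 'make stats' output above. This small Python sketch only restates that arithmetic; the dictionary keys are shortened file names, and the counts are copied from the "without IMDB title ID" column.

```python
# Total number of unique IMDB title IDs reported by 'make stats'.
UNIQUE_WITH_ID = 3186

# Entries without IMDB title ID, per source file.
WITHOUT_ID = {
    "archive-org-butter": 288,
    "archive-org-wikidata": 0,
    "icheckmovies-archive-mochard": 0,
    "imdb-pd": 0,
    "letterboxd-pd": 0,
    "manual": 0,
    "publicdomainmovies": 801,
    "publicdomainreview": 124,
    "publicdomaintorrents": 118,
    "vodo": 196,
}

# If no entry without an ID adds a new movie, only the known IDs count.
low = UNIQUE_WITH_ID

# If every entry without an ID is a distinct movie listed in IMDB.
high = UNIQUE_WITH_ID + sum(WITHOUT_ID.values())

print(low, high)
```

The real number is somewhere in between, since some of the 1527 entries without an ID are duplicates and some are not listed in IMDB at all.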

It would be great for improving the accuracy of this measurement, if the various sources added IMDB title ID to their metadata. I have tried to reach the people behind the various sources to ask if they are interested in doing this, without any replies so far. Perhaps you can help me get in touch with the people behind VODO, Public Domain Torrents, Public Domain Movies and Public Domain Review to try to convince them to add more metadata to their movie entries?

Another way you could help is by adding pages to Wikipedia about movies that are legal to distribute on the Internet. If such a page exists and includes a link to both IMDB and The Internet Archive, the script used to generate free-movies-archive-org-wikidata.json should pick up the mapping as soon as Wikidata is updated.

Tags: english, opphavsrett, verkidetfri.
Some notes on fault tolerant storage systems
1st November 2017

If you care about how fault tolerant your storage is, you might find these articles and papers interesting. They have shaped how I think when designing a storage system.

Several of these research papers are based on data collected from hundreds of thousands or millions of disks, and their findings are eye opening. The short story is simply: do not implicitly trust RAID or redundant storage systems. Details matter. And unfortunately there are few options on Linux addressing all the identified issues. Both ZFS and Btrfs are doing a fairly good job, but have legal and practical issues of their own. I wonder how cluster file systems like Ceph do in this regard. After all, there is an old saying, you know you have a distributed system when the crash of a computer you have never heard of stops you from getting any work done. The same holds true if fault tolerance does not work.

Just remember, in the end, it does not matter how redundant or how fault tolerant your storage is, if you do not continuously monitor its status to detect and replace failed disks.
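As one illustration of what such monitoring can look like, here is a minimal Python sketch that looks for degraded arrays in the text of /proc/mdstat. It assumes Linux software RAID (md), which is only one of many possible setups, and the SAMPLE text is a made up example of the usual layout, not output from a real machine.

```python
import re

# Made up example of /proc/mdstat content: md0 is healthy, md1 lacks a disk.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdc1[0]
      976630336 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status map has a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"^(md\d+)\s*:", line)
        if name:
            current = name.group(1)
            continue
        # The status map looks like [UU]; an underscore marks a missing disk.
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded
```

On a real system the text would come from open("/proc/mdstat").read(), and the check would be run regularly, for example from cron, with a notification sent whenever the returned list is not empty.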

Tags: english, raid, sysadmin.
Web services for writing academic LaTeX papers as a team
31st October 2017

I was surprised today to learn that a friend in academia did not know there are easily available web services for writing LaTeX documents as a team. I thought it was common knowledge, but to make sure at least my readers are aware of it, I would like to mention these useful services for writing LaTeX documents. Some of them even provide a WYSIWYG editor to ease writing even further.

There are two commercial services available, ShareLaTeX and Overleaf. They are very easy to use. Just start a new document, select which publisher to write for (i.e. which LaTeX style to use), and start writing. Note that these two have announced their intention to join forces, so soon there will only be one joint service. I've used both for different documents, and they work just fine. ShareLaTeX is free software, while Overleaf is not. According to an announcement from Overleaf, they plan to keep the ShareLaTeX code base maintained as free software.

But these two are not the only alternatives. Fidus Writer is another free software solution with the source available on github. I have not used it myself. Several others can be found on the nice AlternativeTo web service.

If you like Google Docs or Etherpad, but would like to write documents in LaTeX, you should check out these services. You can even host your own, if you want to. :)

Tags: english.
Locating IMDB IDs of movies in the Internet Archive using Wikidata
25th October 2017

Recently, I needed to automatically check the copyright status of a set of The Internet Movie Database (IMDB) entries, to figure out which of the movies they refer to can be freely distributed on the Internet. This proved to be harder than it sounds. IMDB for sure lists movies without any copyright protection, where the copyright protection has expired or where the movie is licensed using a permissive license like one from Creative Commons. These are mixed with copyright protected movies, and there seems to be no way to separate these classes of movies using the information in IMDB.

First I tried to look up entries manually in IMDB, Wikipedia and The Internet Archive, to get a feel for how to do this. It is hard to know for sure using these sources, but it should be possible to be reasonably confident a movie is "out of copyright" with a few hours of work per movie. As I needed to check almost 20,000 entries, this approach was not sustainable. I simply cannot work around the clock for about 6 years to check this data set.
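The 6 year figure follows from simple arithmetic. In this Python sketch the 2.6 hours per movie is an assumed value picked to illustrate "a few hours"; only the number of entries comes from the text.

```python
ENTRIES = 20_000           # "almost 20,000 entries"
HOURS_PER_YEAR = 24 * 365  # working around the clock


def years_around_the_clock(hours_per_movie):
    """Years needed to check all entries without ever taking a break."""
    return ENTRIES * hours_per_movie / HOURS_PER_YEAR


# ASSUMPTION: 2.6 hours per movie stands in for "a few hours".
print(round(years_around_the_clock(2.6), 1))
```

With normal working hours instead of 24 hours a day, the estimate would be several times longer.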

I asked the people behind The Internet Archive if they could introduce a new metadata field in their metadata XML for IMDB ID, but was told that they leave it completely to the uploaders to update the metadata. Some of the metadata entries had IMDB links in the description, but I found no way to download all metadata files in bulk to locate those ones and put that approach aside.

In the process I noticed several Wikipedia articles about movies had links to both IMDB and The Internet Archive, and it occurred to me that I could use the Wikipedia RDF data set to locate entries with both, to at least get a lower bound on the number of movies on The Internet Archive with an IMDB ID. This is useful based on the assumption that movies distributed by The Internet Archive can be legally distributed on the Internet. With some help from the RDF community (thank you DanC), I was able to come up with this query to pass to the SPARQL interface on Wikidata:

SELECT ?work ?imdb ?ia ?when ?label
WHERE
{
  ?work wdt:P31/wdt:P279* wd:Q11424.
  ?work wdt:P345 ?imdb.
  ?work wdt:P724 ?ia.
  OPTIONAL
  {
        ?work wdt:P577 ?when.
        ?work rdfs:label ?label.
        FILTER(LANG(?label) = "en").
  }
}

If I understand the query right, for every film entry anywhere in Wikipedia, it will return the IMDB ID and The Internet Archive ID, and when the movie was released and its English title, if either or both of the latter two are available. At the moment the result set contains 2338 entries. Of course, it depends on volunteers including both correct IMDB and The Internet Archive IDs in the Wikipedia articles for the movie. It should be noted that the result will include duplicates if the movie has entries in several languages. There are some bogus entries, either because The Internet Archive ID contains a typo or because the movie is not available from The Internet Archive. I did not verify the IMDB IDs, as I am unsure how to do that automatically.

I wrote a small Python script to extract the data set from Wikidata and check if the XML metadata for the movie is available from The Internet Archive, and after around 1.5 hours it produced a list of 2097 free movies and their IMDB IDs. In total, 171 entries in Wikidata lack the referred Internet Archive entry. I assume the 70 "disappearing" entries (i.e. 2338-2097-171) are duplicate entries.
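The following is not the script I used, only a minimal Python sketch of the same idea, under a few assumptions: the WHERE and OPTIONAL wrappers in QUERY are my reading of how the query is structured, the User-Agent string is a made up example, and the metadata URL layout in metadata_url() is a guess that should be verified before use. The fetch function needs network access; the other two do not.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# ASSUMPTION: WHERE/OPTIONAL structure is my reading of the query.
QUERY = """
SELECT ?work ?imdb ?ia ?when ?label
WHERE {
  ?work wdt:P31/wdt:P279* wd:Q11424.
  ?work wdt:P345 ?imdb.
  ?work wdt:P724 ?ia.
  OPTIONAL {
    ?work wdt:P577 ?when.
    ?work rdfs:label ?label.
    FILTER(LANG(?label) = "en").
  }
}
"""


def fetch_results(query=QUERY):
    """Run the query and return the decoded JSON result (needs network)."""
    url = SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
    # Hypothetical User-Agent; replace with something identifying you.
    request = Request(url, headers={"User-Agent": "free-movies-sketch/0.1"})
    with urlopen(request) as response:
        return json.load(response)


def unique_pairs(results):
    """Collapse duplicate rows into a sorted list of (imdb, ia) pairs."""
    rows = results["results"]["bindings"]
    return sorted({(row["imdb"]["value"], row["ia"]["value"]) for row in rows})


def metadata_url(ia_id):
    """ASSUMPTION: guessed location of the metadata XML for an item."""
    return "https://archive.org/download/{0}/{0}_meta.xml".format(ia_id)
```

Collapsing the rows into a set is what makes the duplicate entries "disappear", and requesting metadata_url() for each remaining pair is what separates the working Internet Archive IDs from the broken ones.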

This is not too bad, given that The Internet Archive reports that it contains 5331 feature films at the moment, but it also means more than 3000 movies are missing on Wikipedia or are missing the pair of references on Wikipedia.

I was curious about the distribution by release year, and made a little graph to show how the number of free movies is spread over the years.

I expect the relative distribution of the remaining 3000 movies to be similar.

If you want to help, and want to ensure Wikipedia can be used to cross reference The Internet Archive and The Internet Movie Database, please make sure entries like this are listed under the "External links" heading on the Wikipedia article for the movie:

* {{Internet Archive film|id=FightingLady}}
* {{IMDb title|id=0036823|title=The Fighting Lady}}

Please verify the links on the final page, to make sure you did not introduce a typo.
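If you have many movies to add, the two lines can be generated instead of typed by hand, which reduces the risk of typos. A small Python sketch, where the function name is my own invention and the template syntax is copied from the example above:

```python
def external_links(ia_id, imdb_id, title):
    """Return the two wiki markup lines for the "External links" section."""
    return [
        "* {{Internet Archive film|id=%s}}" % ia_id,
        "* {{IMDb title|id=%s|title=%s}}" % (imdb_id, title),
    ]


# Reproduces the example from the text.
for line in external_links("FightingLady", "0036823", "The Fighting Lady"):
    print(line)
```

Note that the IMDB ID is given without the "tt" prefix in the template, as in the example.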

Here is the complete list, if you want to correct the 171 identified Wikipedia entries with broken links to The Internet Archive: Q1140317, Q458656, Q458656, Q470560, Q743340, Q822580, Q480696, Q128761, Q1307059, Q1335091, Q1537166, Q1438334, Q1479751, Q1497200, Q1498122, Q865973, Q834269, Q841781, Q841781, Q1548193, Q499031, Q1564769, Q1585239, Q1585569, Q1624236, Q4796595, Q4853469, Q4873046, Q915016, Q4660396, Q4677708, Q4738449, Q4756096, Q4766785, Q880357, Q882066, Q882066, Q204191, Q204191, Q1194170, Q940014, Q946863, Q172837, Q573077, Q1219005, Q1219599, Q1643798, Q1656352, Q1659549, Q1660007, Q1698154, Q1737980, Q1877284, Q1199354, Q1199354, Q1199451, Q1211871, Q1212179, Q1238382, Q4906454, Q320219, Q1148649, Q645094, Q5050350, Q5166548, Q2677926, Q2698139, Q2707305, Q2740725, Q2024780, Q2117418, Q2138984, Q1127992, Q1058087, Q1070484, Q1080080, Q1090813, Q1251918, Q1254110, Q1257070, Q1257079, Q1197410, Q1198423, Q706951, Q723239, Q2079261, Q1171364, Q617858, Q5166611, Q5166611, Q324513, Q374172, Q7533269, Q970386, Q976849, Q7458614, Q5347416, Q5460005, Q5463392, Q3038555, Q5288458, Q2346516, Q5183645, Q5185497, Q5216127, Q5223127, Q5261159, Q1300759, Q5521241, Q7733434, Q7736264, Q7737032, Q7882671, Q7719427, Q7719444, Q7722575, Q2629763, Q2640346, Q2649671, Q7703851, Q7747041, Q6544949, Q6672759, Q2445896, Q12124891, Q3127044, Q2511262, Q2517672, Q2543165, Q426628, Q426628, Q12126890, Q13359969, Q13359969, Q2294295, Q2294295, Q2559509, Q2559912, Q7760469, Q6703974, Q4744, Q7766962, Q7768516, Q7769205, Q7769988, Q2946945, Q3212086, Q3212086, Q18218448, Q18218448, Q18218448, Q6909175, Q7405709, Q7416149, Q7239952, Q7317332, Q7783674, Q7783704, Q7857590, Q3372526, Q3372642, Q3372816, Q3372909, Q7959649, Q7977485, Q7992684, Q3817966, Q3821852, Q3420907, Q3429733, Q774474

Tags: english, opphavsrett, verkidetfri.
A one-way wall on the border?
14th October 2017

I find it fascinating how many of the people being locked inside the proposed border wall between the USA and Mexico support the idea. The proposal to keep Mexicans out reminds me of the propaganda twist from the East German government calling the wall the “Antifascist Bulwark” after erecting the Berlin Wall, claiming that the wall was erected to keep enemies from creeping into East Germany, while it was obvious to the people locked inside it that it was erected to keep the people from escaping.

Do the people in the USA supporting this wall really believe it is a one-way wall, only keeping people on the outside from getting in, while not keeping people on the inside from getting out?

Tags: english.
Generating 3D prints in Debian using Cura and Slic3r(-prusa)
9th October 2017

At my nearby maker space, Sonen, I heard the story that it was easier to generate gcode files for their 3D printers (Ultimaker 2+) on Windows and MacOS X than Linux, because the software involved had to be manually compiled and set up on Linux while premade packages worked out of the box on Windows and MacOS X. I found this annoying, as the software involved, Cura, is free software and should be trivial to get up and running on Linux if someone took the time to package it for the relevant distributions. I even found a request for adding it into Debian from 2013, which had seen some activity over the years but never resulted in the software showing up in Debian. So a few days ago I offered my help to try to improve the situation.

Now I am very happy to see that all the packages required by a working Cura in Debian are uploaded into Debian and waiting in the NEW queue for the ftpmasters to have a look. You can track the progress on the status page for the 3D printer team.

The uploaded packages are a bit behind upstream, and were uploaded now to get slots in the NEW queue while we work on updating the packages to the latest upstream version.

On a related note, two competitors for Cura, which I found harder to use and was unable to configure correctly for Ultimaker 2+ in the short time I spent on it, are already in Debian. If you are looking for 3D printer "slicers" and want something already available in Debian, check out slic3r and slic3r-prusa. The latter is a fork of the former.

Tags: 3d-printer, debian, english.
Are you missing a screw, or do you have a screw loose?
4th October 2017

When I work on various projects, I constantly need different screws. The latest project I am working on is building a box for an HDMI touch screen to be used with a Raspberry Pi. The box is assembled with screws and bolts, and I have been unsure where to get hold of the right screws. The nearby Clas Ohlson and Jernia stores have rarely had what I need. But the other day I got a fantastic tip for those of us living in Oslo. Zachariassen Jernvare AS in Hegermannsgate 23A at Torshov has a fantastic selection, and is open between 09:00 and 17:00. They sell screws, nuts, bolts, washers etc. by weight, and so far I have found everything I have been looking for. In addition they have most other kinds of hardware, like tools, lamps, cables, etc. I hope they have enough customers to keep going for a long time, as this is a shop I will visit often. The shop is a real find to have in the neighbourhood for those of us who like to build things ourselves. :)

Tags: norsk.
Visualizing GSM radio chatter using gr-gsm and Hopglass
29th September 2017

Every mobile phone announces its existence over radio to the nearby mobile cell towers. And this radio chatter is available for anyone with a radio receiver capable of receiving it. Details about the mobile phones with very good accuracy are of course collected by the phone companies, but this is not the topic of this blog post. The mobile phone radio chatter makes it possible to figure out when a cell phone is nearby, as it includes the SIM card ID (IMSI). By paying attention over time, one can see when a phone arrives and when it leaves an area. I believe it would be nice to make this information more available to the general public, to make more people aware of how their phones are announcing their whereabouts to anyone that cares to listen.

I am very happy to report that we managed to get something visualizing this information up and running for Oslo Skaperfestival 2017 (Oslo Makers Festival) taking place today and tomorrow at Deichmanske library. The solution is based on the simple recipe for listening to GSM chatter I posted a few days ago, and will show up at the stand of Åpen Sone from the Computer Science department of the University of Oslo. The presentation will show the nearby mobile phones (aka IMSIs) as dots in a web browser graph, with lines to the dot representing the mobile base station each phone is talking to. It was working in the lab yesterday, and was moved into place this morning.

We set up a fairly powerful desktop machine using Debian Buster/Testing with several (five, I believe) RTL2838 DVB-T receivers connected, and visualize the visible cell phone towers using an English version of Hopglass. A fairly powerful machine is needed, as the grgsm_livemon_headless processes from gr-gsm converting the radio signal to data packets are quite CPU intensive.

The frequencies to listen to are identified using a slightly patched scan-and-livemon (to set the --args values for each receiver), and the Hopglass data is generated using the patches in my meshviewer-output branch. For some reason we could not get more than four SDRs working. There is also a geographical map trying to show the location of the base stations, but I believe their coordinates are hardcoded to some random location in Germany. The code should be replaced with code to look up the location in a text file, a sqlite database or one of the online databases mentioned in the github issue for the topic.
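To give an idea of what the sqlite variant of such a lookup could look like, here is a minimal Python sketch. The table name, the column names and the format of the cell ID are all hypothetical, and none of this is taken from the existing code.

```python
import sqlite3


def open_station_db(path=":memory:"):
    """Open the database, creating the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS base_station ("
        "cell_id TEXT PRIMARY KEY, latitude REAL, longitude REAL)"
    )
    return conn


def add_station(conn, cell_id, latitude, longitude):
    """Store or update the coordinates of one base station."""
    conn.execute(
        "INSERT OR REPLACE INTO base_station VALUES (?, ?, ?)",
        (cell_id, latitude, longitude),
    )
    conn.commit()


def lookup_location(conn, cell_id):
    """Return (latitude, longitude), or None if the station is unknown."""
    return conn.execute(
        "SELECT latitude, longitude FROM base_station WHERE cell_id = ?",
        (cell_id,),
    ).fetchone()
```

Returning None for unknown stations lets the map code skip them instead of placing them at a hardcoded location.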

If this sounds interesting, visit the stand at the festival!

Tags: debian, english, personvern, surveillance.
