Share your JupyterLite Examples

Symbioquine · February 12, 2022, 7:05pm

I’m creating this topic as a place for folks to share examples and useful snippets related to using JupyterLite with farmOS. (Possibly via the JupyterLite Drupal Module.)

Ideally, each example/snippet should include a screenshot and a copy of the notebook. (A GitHub Gist works well for the latter since GitHub will format the notebook nicely.)

Symbioquine · February 12, 2022, 7:07pm

Seeding Logs by Weekday

https://gist.github.com/symbioquine/1ad8708ddf83afb3e907b95002d9e744

seeding_logs_by_weekday.ipynb

{
  "metadata": {
    "language_info": {
      "codemirror_mode": {
        "name": "python",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",

This file has been truncated; see the Gist linked above for the full notebook.

Animal CSV Import

https://gist.github.com/symbioquine/7641a2ab258726347ec937e8ea02a167

animal_csv_import.ipynb

{
  "metadata": {
    "language_info": {
      "codemirror_mode": {
        "name": "python",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",

This file has been truncated; see the Gist linked above for the full notebook.

animals.csv

animal_name,animal_dob,animal_sex
alice,2021/01/18,F
bob,2021/03/12,M
curt,2020/05/01,M
dolly,2021/06/08,F
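
The notebook above is truncated, so the following is not its code. It is only a minimal sketch, using the Python standard library, of how the sample rows in animals.csv could be read into dictionaries with the date of birth parsed. The function name parse_animals is hypothetical, and the date format is assumed from the four sample rows.

```python
import csv
import io
from datetime import datetime

# Sample data copied from animals.csv above.
ANIMALS_CSV = """animal_name,animal_dob,animal_sex
alice,2021/01/18,F
bob,2021/03/12,M
curt,2020/05/01,M
dolly,2021/06/08,F
"""


def parse_animals(csv_text):
    """Return one dict per CSV row, with animal_dob parsed into a date.

    Hypothetical helper; assumes dates are written as YYYY/MM/DD,
    as they are in the sample rows.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["animal_dob"] = datetime.strptime(row["animal_dob"], "%Y/%m/%d").date()
        rows.append(row)
    return rows


animals = parse_animals(ANIMALS_CSV)
for animal in animals:
    print(animal["animal_name"], animal["animal_dob"].isoformat(), animal["animal_sex"])
```

Parsing the dates up front means a malformed value fails loudly at read time rather than later, when records are being created.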

mstenta · February 12, 2022, 7:21pm

Great topic idea @Symbioquine! This is exciting stuff!

Attn: @walt @Farmer-Ed @pat @JustGav @gbathree

Cross-reference: CSV Importers in v2.x

pat · February 12, 2022, 8:49pm

Nice. Gonna have to dig into this. (This too)

Farmer-Ed · February 12, 2022, 9:00pm

Cheers @mstenta I’ve installed the module, just have to go and figure out what it can do for me.

mstenta · February 12, 2022, 9:25pm

This YouTube video might help (if you didn’t see it already in the CSV thread, or for newcomers who find this comment)…

Note: @donblair mentions in the video that farmOS.py doesn’t work, but @paul121 figured out a way to do it… See this Gist.

Symbioquine · February 15, 2022, 4:54pm

Land KML Import

https://gist.github.com/symbioquine/fd6a6f8ea55e4d3fcfea1afc23a31178

land_kml_import.ipynb

{
  "metadata": {
    "language_info": {
      "codemirror_mode": {
        "name": "python",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",

This file has been truncated; see the Gist linked above for the full notebook.

The data is the cb_2018_us_state_20m.kml file from https://www2.census.gov/geo/tiger/GENZ2018/kml/cb_2018_us_state_20m.zip

walt · February 19, 2022, 2:36pm

Keen as i am to see this JupyterLite API interface coming to an instance near me (i.e. myinstance.farmos.net/jupyterlite -right @mstenta ? ), i have in the meantime been exploring this alternative by Google called “Colaboratory” - first awkward experiments in this here .ipynb notebook.

AFAICT, this is essentially a Jupyter Notebook running in the browser (i.e. zero-installation on desktop), but powered by Google’s analytics engine in the cloud, so it’s like the JupyterLite example @donblair shared in that respect. All those DataSci py libs popular in DS/ML circles are there for the 'import' -e.g. ‘requests’ @paul121 -and access to the farmOS.py library is as easy as '!pip install farmOS==1.0.0b3', as you can see in the example linked above. It has some value-add features in the interface like a TOC outliner, and a very cool “Interactive Data Table” extension, which wraps filter/ sort/ pagination controls around the pandas dataframe. And of course (this being Google), they make it easy to share notebooks privately to anyone with a gmail, save it to gDrive, Dropbox, GitHub, gist, etc.

All that being said: in the interest of using FOSS whenever possible, i would much rather have my farmOS API interface in JupyterLite, even without those extra Google features. Just so long as i can run farmOS.py in it- including that 'requests' library, which i gather is essential -then i will happily adopt JupyterLite as my API interface, just as soon as it lands in Farmier.

mstenta · February 19, 2022, 2:55pm

@walt You should be able to do whatever you need already! You can use the JupyterLite hosted on EdgeCollective’s (@donblair’s) GitHub Pages: https://edgecollective.io/jupyterlite/

The only thing that’s necessary to get that to work with Farmier-hosted instances is we need to “bless” (via CORS config) the edgecollective.io domain in your instance so that your browser “trusts” the third-party URL. I did that already for yours, so you should be good to go!

(If anyone else is using Farmier and wants to play around with https://edgecollective.io/jupyterlite/ ping me!)

Ultimately, it will be great to include the new Drupal JupyterLite module that @Symbioquine and I started on Farmier - but that’s not strictly necessary. The only benefit is you don’t need to configure CORS because it’s serving JupyterLite from the same domain, so your browser will automatically “trust” it.

@walt Just be aware of the way “file storage” works with JupyterLite, so you don’t lose anything! All files you upload or create in JupyterLite are stored IN YOUR BROWSER SESSION … they are NOT stored on the farmOS server. If you clear your browser cache or change browsers, they are gone!

We have some fun ideas to store/serve files from farmOS itself being discussed here:

walt · February 19, 2022, 3:23pm

Thanks, @mstenta -since you configured my instance to accept API calls from that URL- i did have a serious play with this. Works just fine for API access- nice! But when i upload a .CSV to that same directory in which the .ipynb resides, and try to do anything with it, it fails to find the file. Have tried every way i could think to get around this, and consulted @donblair about the problem… No joy

Good cautionary note, thanks -but not a problem at this point, as my browser cannot lose what it cannot even find in the first place! In fact the browser finds & uploads the file just fine; it’s jLite that can’t seem to find it.

Heh: i’ll bet that “some [confused] users” Don mentioned in first line of his problem statement was inspired by yours truly <8-)…

Symbioquine · February 19, 2022, 4:22pm

For now, you need this bit of magic that I’ve included in some of my examples above;

# From https://gist.github.com/bollwyvl/132aaff5cdb2c35ee1f75aed83e87eeb
async def get_contents(path):
    """use the IndexedDB API to access JupyterLite's in-browser (for now) storage
    
    for documentation purposes, the full names of the JS API objects are used.
    
    see https://developer.mozilla.org/en-US/docs/Web/API/IDBRequest
    """
    import js, asyncio

    DB_NAME = "JupyterLite Storage"

    # we only ever expect one result, either an error _or_ success
    queue = asyncio.Queue(1)
    
    IDBOpenDBRequest = js.self.indexedDB.open(DB_NAME)
    IDBOpenDBRequest.onsuccess = IDBOpenDBRequest.onerror = queue.put_nowait
    
    await queue.get()
    
    if IDBOpenDBRequest.result is None:
        return None
        
    IDBTransaction = IDBOpenDBRequest.result.transaction("files", "readonly")
    IDBObjectStore = IDBTransaction.objectStore("files")
    IDBRequest = IDBObjectStore.get(path, "key")
    IDBRequest.onsuccess = IDBRequest.onerror = queue.put_nowait
    
    await queue.get()
    
    return IDBRequest.result.to_py() if IDBRequest.result else None

If you put that in the top cell of your notebook (and run it first), then you can access the contents of an “uploaded” (to the browser storage) file named “my_file.csv” (as a str object) with;

csv_str = (await get_contents("my_file.csv"))["content"]

Or if you need a file object you can wrap that in io.StringIO. e.g.

import io
file_obj = io.StringIO((await get_contents("my_file.csv"))["content"])

mstenta · February 19, 2022, 4:34pm

Oh thanks @Symbioquine! I wasn’t aware of that issue. @walt hopefully that helps! The same issue would have been present on a Farmier-hosted JupyterLite, in either case.

Symbioquine · February 19, 2022, 4:42pm

Yeah, I think that’s one of the rough edges that will get resolved as JupyterLite moves out of alpha. Maybe as part of jupyterlite/jupyterlite#315.

walt · February 19, 2022, 4:49pm

Thanks @Symbioquine, but… I did put the long magic spell atop the notebook, have tried both of the calls you suggested below it, and neither seems to work. You can see from the below screenshot that my CROPS.csv for upload is in the same directory as the .ipynb… And you can see the error msg when i try the first method. Can you make any sense of this?

Symbioquine · February 19, 2022, 4:54pm

I think I see two problems;

- If the cell at the top of the screenshot is the “long magic spell” you copied, I don’t think you got the whole thing. The last line should start with “return IDBRequest”
- I think you need to remove the quotes from df=pd.read_csv("csv_str"). I think it should be df=pd.read_csv(csv_str).

walt · February 19, 2022, 5:09pm

Thanks @Symbioquine ; fixed the two errors you cite, but i still get the same error- TypeError: 'NoneType' object is not subscriptable -on line 2, where the “csv_str” variable is defined (not on line 4, where it is invoked -now w/o the quotes).

Got any more insight into what that error msg might actually refer to? (could be many things, according to google, that would trigger this “most common exception in python”)

walt · February 19, 2022, 6:51pm

PS: problem solved, using this method, following that longer script above in the previous block:

import io
file_obj = io.StringIO((await get_contents("myfilename.csv"))["content"])
df=pd.read_csv(file_obj)

note to fellow n00bs: you can’t just refer to yourfilename.csv in the script, even tho it’s in the same directory as the .ipynb ; you should grab the path via right-click on the file in Jupyterlite file browser, because it needs an absolute reference (was nested in the farmos/ directory, in this case).
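
The “'NoneType' object is not subscriptable” error earlier in the thread is what happens when get_contents returns None (no file stored under that path) and the result is then indexed with ["content"]. A minimal sketch of a wrapper that reports the missing path instead is below. The name open_uploaded_text and the stand-in fake_get_contents are hypothetical, and the key "farmos/CROPS.csv" is only assumed from the description above; in a notebook you would pass the real get_contents.

```python
import asyncio
import io


async def open_uploaded_text(get_contents, path):
    """Return a file-like object for an uploaded text file.

    Hypothetical helper: `get_contents` is any async function that returns
    a record with a "content" entry, or None when nothing is stored at `path`.
    """
    record = await get_contents(path)
    if record is None:
        raise FileNotFoundError(
            f"Nothing stored under {path!r}. Copy the full path from the "
            "file browser, including any parent directory."
        )
    return io.StringIO(record["content"])


async def fake_get_contents(path):
    """Stand-in for the real get_contents, so the sketch runs anywhere."""
    store = {"farmos/CROPS.csv": {"content": "crop,beds\nkale,3\n"}}
    return store.get(path)


# Outside a notebook asyncio.run drives the coroutine; inside JupyterLite
# use `await open_uploaded_text(get_contents, "farmos/CROPS.csv")` instead.
file_obj = asyncio.run(open_uploaded_text(fake_get_contents, "farmos/CROPS.csv"))
print(file_obj.read())
```

The returned object can be handed to anything that expects an open file, such as pd.read_csv(file_obj).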

Thanks a heap, @Symbioquine , for all the help it took to find my mistake!

walt · February 20, 2022, 11:40am

In the course of my stumbling around yesterday, related to this CSV upload challenge, i came to a deeper realisation of what you wrote yesterday @mstenta, i.e.:

Just be aware of the way “file storage” works with JupyterLite, so you don’t lose anything! All files you upload or create in JupyterLite are stored IN YOUR BROWSER SESSION … they are NOT stored on the farmOS server. If you clear your browser cache or change browsers, they are gone!

Yes; in fact, to debug my problem, i had to switch from Chrome to Firefox, and of course the subject file had to be re-uploaded… Which put me to wonder about a coupla things, in context of this UseCase:

- A primary benefit of this application architecture is having the freedom to work from any machine- e.g. farm office and/or home -but, given the diffs that will inevitably arise in state across those two machines, how can we mitigate the confusion that will consequently arise?
- Memory management: This little (26kb) CSV is small enough to be of no concern… But this being a workflow we plan to run at least weekly- and sometimes on much larger files -what might be the negative impact on browser/system performance, and what should be done to mitigate that problem?
- Given these (and other?) limitations, a JupyterLite NB that references files is not suitable for sharing a replicable result -whether in the interest of tech support (as @Symbioquine and i experienced yesterday) or in the larger context of Replicable Data Science.

Obviously i don’t understand this technology enough… So i did a little digging yesterday, from which i gathered that localStorage is a subtype of Web Storage (f.k.a. DOM Storage), along with two other forms that are more familiar to me. What complicates matters further is the different ways in which browser-makers implement the standard (that’s the point at which my head started to hurt, so i quit digging), but i did find this little table (pictured below) that helped me to understand essential similarities & diffs.

Bottom-line: There’s enough deep voodoo about this stuff that- to avoid sliding into even deeper doodoo! -i think it will be wise to store any files referenced in the JupyterLite NB in an online archive, and link them explicitly in the document.
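
One possible way to notice when a file has drifted between browsers or machines is to record a checksum of its contents next to the link in the notebook, then compare it after each upload. This is a minimal sketch using only the standard library; the function names fingerprint and check_fingerprint are hypothetical and the sample text is made up for illustration.

```python
import hashlib


def fingerprint(text):
    """Return the SHA-256 hex digest of a text file's contents (UTF-8)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_fingerprint(text, expected):
    """Raise ValueError if the contents do not match the recorded digest."""
    actual = fingerprint(text)
    if actual != expected:
        raise ValueError(f"File contents changed: expected {expected}, got {actual}")


# Made-up contents standing in for an uploaded CSV.
original = "crop,beds\nkale,3\n"
recorded = fingerprint(original)  # paste this value into the notebook

check_fingerprint(original, recorded)  # passes silently: same contents

edited = original + "chard,1\n"
try:
    check_fingerprint(edited, recorded)
except ValueError as err:
    print(err)
```

With the string returned by get_contents as the input, the same check would work on a file re-uploaded in a different browser.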

Screenshot 2022-02-20 at 11.01.30

Symbioquine · February 20, 2022, 3:26pm

Hey @walt this article is a little better because it’s a little more up to date and includes IndexedDB, which is the storage API which JupyterLite will use in most browsers.

A more up-to-date table from that article.

Otherwise, I think you’re making some great points - most of which don’t have concrete answers.

I will say though that many of those issues are mitigated just by changing how we think of the “storage” in JupyterLite. If we consider the storage in JupyterLite like a sandbox environment or temporary work area, then we can treat those things as advantages.

I would argue that it is better for both the tech support and replicable data science scenarios.

- I was able to start with a clean slate and bring in just the files I needed to try and reproduce the problem you were having.
- I was able to modify the files without any risk of losing data or breaking things for anyone else.
- It forces me to be intentional about sharing just the versions/changes which are important.
  - Conversely, it helps ensure the collective workspace isn’t increasingly littered with semi-relevant experiments.

I would also argue that one of the foundations of truly replicable data science is going to be consistent and disciplined use of version control technologies like Git to manage the source code - and in some cases sample data. Perhaps in the future JupyterLite could help with that part, but in a way it’s kind of beautiful that it doesn’t. Its job is just to be a place that’s reproducible (but not replicated) between users to run some scripts in a little more interactive environment than a text editor and a command line.

walt · February 20, 2022, 3:59pm

Thanks @Symbioquine for providing a more nuanced perspective. I can see how what i was inclined to view as bugs might be considered features… Just so long as we (a) treat it as a “sandbox,” and (b) employ that “more consistent and disciplined use of version control tech” for both sources and sample data.

Also: since you had me open Developer Tools yesterday (in search of that CROPS.csv file), and as that article you linked explains in more detail (illustrated by screenshot below), i can now easily navigate to where these files are stored (IndexedDB indeed, in both Chrome and Firefox), confirm the keys and drill down into values… But only by navigating the JSON tree, in which form my nice tidy table has been rendered.

One day i hope to get over my allergy to these so-deeply-nested JSON trees; still, being more of a rows&columns kinda guy, i have to ask: is there any easy way to translate this JSON tree back into tabular .CSV form?
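
For an uploaded CSV, the "content" value returned by get_contents earlier in this thread is already the file's text as a str, so reading that value gives the tabular form back without any translation. For other nested JSON, a rough sketch of flattening each record into one row with dotted column names is below. The helper names flatten and records_to_csv and the sample records are hypothetical; real data may need different handling for lists or missing values.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        return {prefix: obj}
    flat = {}
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, name))
    return flat


def records_to_csv(records):
    """Write a list of nested records as CSV text, one row per record."""
    rows = [flatten(record) for record in records]
    # Collect column names in first-seen order across all rows.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are left empty
    return out.getvalue()


# Made-up nested records for illustration.
records = [
    {"name": "alice", "attributes": {"sex": "F", "tags": ["ewe", "2021"]}},
    {"name": "bob", "attributes": {"sex": "M"}},
]
csv_text = records_to_csv(records)
print(csv_text)
```

Dotted column names keep the path through the tree visible, so a column like attributes.sex can still be traced back to where it sat in the JSON.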
