Browser Extension with Notion database Integration — Building ArXiv Enhanced (1)

ArXiv Enhanced is a browser extension for adding comments, tags, and other details to an arXiv paper and storing them in a Notion database. This article covers tips on building the Chrome extension, bundling it with webpack, and touches on the Notion API. A later article will cover other features of the extension: adding notes to PDFs, highlighting, and OAuth.

Notion database showing the various columns

TL;DR
Implementing a Chrome plugin that connects with a Notion database

1. Problem

One of my current roles is deep learning researcher. Deep learning (DL) is a fast-paced research field, so to keep abreast of advances in DL/AI I start most of my days scanning arXiv for new work (in sound and vision topics). To keep up with the glut of research, for a long time I maintained a Google Sheet with the headings [title, paper link, what’s novel, read again]. A few months ago I started using Notion and instantly fell in love with the tool.

Tracking read papers, highlighting important bits, and adding notes got a bit time-consuming. I explored whether I could automate parts of it, and decided to write a Chrome extension.

MVP Requirements

  1. Add comments, tags, and a checkbox (completed reading)
  2. Only submit an API call if the active tab matches https://arxiv.org/pdf/*.pdf
  3. If there is already an entry, populate the extension popup with the previously filled details
  4. Scrape https://arxiv.org/abs/* for the paper title
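
Requirements 2 and 4 both depend on recognising an arXiv PDF URL and mapping it to its abs page. Here is a minimal sketch of that check, assuming the URL pattern stated in requirement 2. The names `ARXIV_PDF_PATTERN`, `isArxivPdfUrl`, and `toAbsUrl` are hypothetical and are not taken from the extension's source.

```javascript
// Follows requirement 2: https://arxiv.org/pdf/<paper id>.pdf
const ARXIV_PDF_PATTERN = /^https:\/\/arxiv\.org\/pdf\/(.+)\.pdf$/;

// True only when the tab URL is an arXiv PDF page.
function isArxivPdfUrl(url) {
  return ARXIV_PDF_PATTERN.test(url);
}

// Maps a PDF URL to the abs URL that carries the title and abstract.
// Returns null when the input is not an arXiv PDF URL.
function toAbsUrl(pdfUrl) {
  const match = ARXIV_PDF_PATTERN.exec(pdfUrl);
  return match ? `https://arxiv.org/abs/${match[1]}` : null;
}
```

The popup can call `isArxivPdfUrl` on the active tab's URL and skip the Notion request entirely when it returns false.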

2. Integration with Notion

Notion is quite handy for managing projects, personal life, to-do lists, writing articles, and so on, including collaboratively. For this article I will assume a private database (table); however, it can also be shared with other accounts or made public, which allows multiple people to track and use the database.

Step 1: Setting up and sharing the database
Notion allows app access through the Notion API. Here we want to set up a database (table) that records the arXiv paper details. (a) Create an integration named “ArXiv Enhanced Database” in my-integrations. (b) Create a full-page table with the necessary columns; this table is the database. The ID of the database is in its URL: https://notion.so/workspace/<database_id>. (c) Share the database (full-page table) with the integration using the Share button. See [1] for a more detailed explanation of creating an integration.
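
If you would rather paste the database URL into the extension's settings than copy the ID by hand, the ID can be pulled out of the URL. This is a rough sketch under two assumptions: the ID is the 32-character hexadecimal string at the end of the last path segment, and anything after `?` (such as a view ID) is not part of it. `extractDatabaseId` is a hypothetical helper name.

```javascript
// Extracts the database ID from a Notion database URL.
// Assumes the ID is 32 hex characters at the end of the last path segment.
function extractDatabaseId(databaseUrl) {
  const { pathname } = new URL(databaseUrl);
  const lastSegment = pathname.split("/").filter(Boolean).pop() || "";
  const match = lastSegment.match(/[0-9a-f]{32}$/i);
  return match ? match[0] : null;
}
```

Returning `null` instead of throwing lets the settings form show a validation message when the pasted URL does not look like a database URL.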

Step 2: Notion JavaScript client
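
As an illustration of the calls involved, here is a dependency-free sketch that talks to the Notion REST API with `fetch` rather than through the official JavaScript client. It looks up an existing row for a paper (MVP requirement 3) and then updates or creates it. Several things here are assumptions: the column names `Title`, `URL`, `Comments`, `Tags`, and `Read`, the `Notion-Version` value, and every function name. Adapt them to your own table and integration.

```javascript
const NOTION_API = "https://api.notion.com/v1";
// Assumed API version; use the one your integration targets.
const NOTION_VERSION = "2022-06-28";

function notionHeaders(secret) {
  return {
    Authorization: `Bearer ${secret}`,
    "Notion-Version": NOTION_VERSION,
    "Content-Type": "application/json",
  };
}

// Hypothetical column names; they must match the table's columns exactly.
function buildPageProperties({ title, url, comments, tags, read }) {
  return {
    Title: { title: [{ text: { content: title } }] },
    URL: { url },
    Comments: { rich_text: [{ text: { content: comments } }] },
    Tags: { multi_select: tags.map((name) => ({ name })) },
    Read: { checkbox: read },
  };
}

// Sends one JSON request and throws on a non-2xx response.
async function notionRequest(secret, method, path, body, fetchImpl = fetch) {
  const response = await fetchImpl(`${NOTION_API}${path}`, {
    method,
    headers: notionHeaders(secret),
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Notion API ${method} ${path} failed: ${response.status}`);
  }
  return response.json();
}

// Returns the existing row for this paper URL, or null if there is none.
async function findPaper(secret, databaseId, url, fetchImpl = fetch) {
  const { results } = await notionRequest(
    secret, "POST", `/databases/${databaseId}/query`,
    { filter: { property: "URL", url: { equals: url } } }, fetchImpl);
  return results.length > 0 ? results[0] : null;
}

// Updates the row if it exists, otherwise creates a new one.
async function savePaper(secret, databaseId, paper, fetchImpl = fetch) {
  const properties = buildPageProperties(paper);
  const existing = await findPaper(secret, databaseId, paper.url, fetchImpl);
  if (existing) {
    return notionRequest(secret, "PATCH", `/pages/${existing.id}`, { properties }, fetchImpl);
  }
  return notionRequest(secret, "POST", "/pages",
    { parent: { database_id: databaseId }, properties }, fetchImpl);
}
```

The `fetchImpl` parameter exists only so the functions can be exercised with a stub; in the extension the default global `fetch` is used. The same three operations (query a database, create a page, update a page) are also exposed by the official client if you prefer it.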

The main concepts of a browser extension are background script, content script, UI elements and message passing.

  • Background scripts respond to browser events and perform certain actions (for example fetching data).
  • “Content scripts are files that run in the context of web pages”. They can read details of the page, such as changes made to it.
  • UI elements include a browser popup page (shown when the extension icon is clicked) and an options page.
  • Since content scripts run within the context of a web page, they need a way to communicate with the rest of the extension. This is message passing.

For the current plugin, we require a popup which has 2 input fields (comments and tags) and a checkbox to indicate whether the paper has been read. I use React and Chakra UI for this.

Scraping the title of the arXiv paper:
For every PDF URL on arXiv there is a matching abs URL, whose page contains the abstract and title. The background script can fetch that page and extract the title from it.
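
One possible way to do this is sketched below. It assumes the abs page exposes the title in a `citation_title` meta tag and, as a fallback, in a `<title>` tag of the form `[id] Title`. Both are assumptions about arXiv's markup and may need adjusting. A regular expression is used because `DOMParser` is not available in a Manifest V3 service worker. `parseTitle` and `fetchPaperTitle` are hypothetical names.

```javascript
// Extracts the paper title from the HTML of an arXiv abs page.
// Returns null if neither of the assumed tags is present.
function parseTitle(html) {
  const meta = html.match(/<meta\s+name="citation_title"\s+content="([^"]*)"/i);
  if (meta) return meta[1];
  // Fallback: <title>[1706.03762] Some Title</title>
  const tag = html.match(/<title>\s*(?:\[[^\]]*\]\s*)?([^<]*?)\s*<\/title>/i);
  return tag && tag[1] ? tag[1] : null;
}

// Downloads the abs page and returns its title.
// HTML entities such as &amp; are left undecoded in this sketch.
async function fetchPaperTitle(absUrl, fetchImpl = fetch) {
  const response = await fetchImpl(absUrl);
  if (!response.ok) {
    throw new Error(`Could not load ${absUrl}: ${response.status}`);
  }
  return parseTitle(await response.text());
}
```

Keeping the parsing in its own function makes it easy to check against a saved copy of an abs page without any network access.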

Message Passing between background script and popup:
We use a long-lived connection to communicate between the background script and the popup. In the background script, we add a listener for the onMessage event. On receiving a message from the popup (get-current-url), the background script gets the URL of the current tab and fetches the title of the current arXiv paper.
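
The exchange might look roughly like the sketch below, which uses the `chrome.runtime.connect` / `chrome.runtime.onConnect` port API. It assumes Manifest V3 (where `chrome.tabs.query` returns a promise) and that the extension has permission to read the tab's URL. The message shapes (`{ type: "get-current-url" }` and the `current-url` reply) and the function names are assumptions. The `chrome` object and the title lookup are passed in as arguments so the logic can be tried outside a browser.

```javascript
// background.js: answer "get-current-url" requests arriving on any port.
// getTitle is any async function that maps a tab URL to a paper title.
function registerBackground(chromeApi, getTitle) {
  chromeApi.runtime.onConnect.addListener((port) => {
    port.onMessage.addListener(async (message) => {
      if (message.type !== "get-current-url") return;
      const [tab] = await chromeApi.tabs.query({ active: true, currentWindow: true });
      const url = tab ? tab.url : undefined;
      const title = url ? await getTitle(url) : null;
      port.postMessage({ type: "current-url", url, title });
    });
  });
}

// popup.js: open a port, ask for the current paper, and pass the reply on.
function requestCurrentPaper(chromeApi, onResult) {
  const port = chromeApi.runtime.connect({ name: "popup" });
  port.onMessage.addListener((message) => {
    if (message.type === "current-url") onResult(message);
  });
  port.postMessage({ type: "get-current-url" });
  return port;
}
```

In the extension itself, the background script would call `registerBackground(chrome, ...)` once at startup, and the popup would call `requestCurrentPaper(chrome, ...)` when it opens, using the reply to fill in the form.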

We could build the features discussed so far in vanilla JavaScript, which is what the extension expects, but we would lose the convenience of packages. Since this extension uses quite a few libraries (and React), it has to be built before it can be loaded in Chrome or other browsers.

In the process I discovered Parcel, which I think is awesome. It also has a web-extension config, which currently works only for Manifest V2 (V3 support is under development: issue #6079). For this case, we can build the entries separately: popup.html, popup.js, and background.js.

An alternative is to use webpack. Looking around, I found a super useful boilerplate repository, chrome-extension-boilerplate-react, which works like a charm.
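
To show what the bundler has to do, here is a stripped-down webpack config with one entry per extension script. It is not the boilerplate's actual configuration. The `src/` paths, the `build` output folder, and the use of `babel-loader` (with a React preset configured separately) are all assumptions. Copying `manifest.json` and `popup.html` into the output folder is left out.

```javascript
// webpack.config.js (illustrative only; paths and loader are assumptions)
const path = require("path");

module.exports = {
  mode: "production",
  // One bundle per script referenced by the manifest and popup.html.
  entry: {
    popup: "./src/popup.jsx",
    background: "./src/background.js",
  },
  output: {
    path: path.resolve(__dirname, "build"),
    filename: "[name].bundle.js", // popup.bundle.js, background.bundle.js
  },
  module: {
    rules: [
      // Transpile JSX and modern syntax; needs babel-loader installed.
      { test: /\.jsx?$/, exclude: /node_modules/, use: "babel-loader" },
    ],
  },
  resolve: { extensions: [".js", ".jsx"] },
};
```

Separate entries matter because the background script and the popup run in different contexts and are loaded from different places, so each needs its own output file.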

What remains to be done:

  • At the moment, the extension authenticates with Notion using an integration secret key stored in localStorage. This is not safe. In the next article, I will look at using Notion’s OAuth for authorization.
  • Text highlighting and adding notes to the PDF (and storing them in the database)
  • Creating arXiv links for the referenced papers

References

  1. https://developers.notion.com/docs/getting-started
  2. GitHub repo: https://github.com/akashrajkn/arxiv-enhanced

Artificial Intelligence Researcher | Developer