User interactions with the browser

To automate the browser, I first need to list all the user interactions with it. I divided them into two groups. The first is webpage interactions: keyboard, mouse, and focus events. The second is browser-level interactions: windows, tabs, and the address bar.

First, let's list the webpage interactions, because those (besides selection) are fairly simple.

Browsers have three keyboard events:

And a bunch of mouse events:

Finally, focus events:
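As a rough sketch of how these three groups could be recorded together, the snippet below attaches one capturing listener per event type. The event names are my assumption of what belongs in each group (keydown, keypress, and keyup are the three standard keyboard events; the mouse and focus lists are illustrative, not exhaustive), and attachRecorder is a hypothetical helper, not part of any browser API.

```javascript
// Assumed event names for each group; adjust to whatever needs recording.
const KEYBOARD_EVENTS = ["keydown", "keypress", "keyup"];
const MOUSE_EVENTS = ["click", "dblclick", "mousedown", "mouseup", "mousemove", "contextmenu", "wheel"];
const FOCUS_EVENTS = ["focus", "blur"];

// Attach one listener per event type and report each event as a plain object.
// Returns a function that removes all the listeners again.
function attachRecorder(target, onAction) {
    const types = [...KEYBOARD_EVENTS, ...MOUSE_EVENTS, ...FOCUS_EVENTS];
    const handler = (event) => onAction({ type: event.type, time: Date.now() });
    // Capture phase, because focus and blur do not bubble up to the document.
    const options = { capture: true };
    for (const type of types) {
        target.addEventListener(type, handler, options);
    }
    return () => types.forEach((type) => target.removeEventListener(type, handler, options));
}

// In a content script the recorder would be attached to the page document.
if (typeof document !== "undefined") {
    attachRecorder(document, (action) => console.log(action));
}
```

Calling the returned function detaches everything, which is one simple way to pause recording.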

Those are the basics I want to record in order to automate browser interactions. For each one I need to know where in the DOM the interaction occurred and what its effects were. Sometimes, to get the effects, I need to listen to the browser itself.
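One possible way to describe "the place in the DOM" is a CSS path built from the event target. This is only a sketch: cssPath is a hypothetical helper, it assumes ids are unique, and it uses nothing beyond the standard Element properties id, tagName, parentElement, and previousElementSibling.

```javascript
// Build a simple CSS path such as "form#login > input:nth-of-type(2)".
function cssPath(element) {
    const parts = [];
    for (let el = element; el && el.tagName; el = el.parentElement) {
        const tag = el.tagName.toLowerCase();
        if (el.id) {
            // An id is assumed to be unique, so the path can stop here.
            parts.unshift(`${tag}#${el.id}`);
            break;
        }
        // Count earlier siblings with the same tag to get the nth-of-type index.
        let index = 1;
        for (let sib = el.previousElementSibling; sib; sib = sib.previousElementSibling) {
            if (sib.tagName === el.tagName) index += 1;
        }
        parts.unshift(`${tag}:nth-of-type(${index})`);
    }
    return parts.join(" > ");
}
```

In an event listener this would be called as cssPath(event.target), and the result stored next to the event type.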

The great thing is that the browser API called WebExtensions is now mostly cross-browser compatible, so I can write once and deploy everywhere with small modifications. For example, between Chrome/Opera and Firefox the minimum would be:

// Firefox exposes the API as `browser`; Chrome and Opera expose it as `chrome`.
if (typeof browser === "undefined") {
    var browser = chrome;
}

But the WebExtensions API is not the point of this post, so let's get back to browser interactions.

Let's start with webNavigation, so I know the current status of the webpage.
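A minimal sketch of how the page status could be followed with webNavigation. The five event names are real webNavigation events; trackNavigation and the shape of the status object are my own assumptions. It also assumes a global named browser is available and that the extension has the webNavigation permission.

```javascript
// The first four fire in this order for a successful navigation;
// onErrorOccurred fires instead of onCompleted when the navigation fails.
const NAVIGATION_EVENTS = ["onBeforeNavigate", "onCommitted", "onDOMContentLoaded", "onCompleted", "onErrorOccurred"];

// Report every top-level navigation step as { status, tabId, url }.
function trackNavigation(webNavigation, onStatus) {
    for (const name of NAVIGATION_EVENTS) {
        webNavigation[name].addListener((details) => {
            // frameId 0 is the top-level document; sub-frames are ignored here.
            if (details.frameId !== 0) return;
            onStatus({ status: name, tabId: details.tabId, url: details.url });
        });
    }
}

// Only runs inside an extension where `browser.webNavigation` exists.
if (typeof browser !== "undefined" && browser.webNavigation) {
    trackNavigation(browser.webNavigation, (status) => console.log(status));
}
```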

Then there is tabs, so I can detect when someone wants to interact between two applications in different tabs, or when something else happened, e.g. a popup showed up.
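The tab side could be sketched like this. onCreated, onActivated, onUpdated, and onRemoved are real tabs events; trackTabs and the action type names ("tabCreated" and so on) are hypothetical. Note that changeInfo.url is only provided when the extension has the tabs permission or a matching host permission.

```javascript
// Turn tab events into plain action objects.
function trackTabs(tabs, onAction) {
    tabs.onCreated.addListener((tab) => onAction({ type: "tabCreated", tabId: tab.id }));
    tabs.onActivated.addListener((activeInfo) => onAction({ type: "tabActivated", tabId: activeInfo.tabId }));
    tabs.onUpdated.addListener((tabId, changeInfo) => {
        // changeInfo only contains the properties that changed; other updates are skipped.
        if (changeInfo.url) onAction({ type: "tabNavigated", tabId, url: changeInfo.url });
    });
    tabs.onRemoved.addListener((tabId) => onAction({ type: "tabClosed", tabId }));
}

// Only runs inside an extension where `browser.tabs` exists.
if (typeof browser !== "undefined" && browser.tabs) {
    trackTabs(browser.tabs, (action) => console.log(action));
}
```

A popup that opens in a new tab would show up here as a tabCreated action followed by tabActivated.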

The last one is webRequest, which covers the requests a webpage makes, so we can store data for later use. Sometimes we also need to delay the next browser interaction until the data has loaded, so by recording webRequest events we know when to wait and when to simply interact with the webpage (I hope that makes sense). Let's not forget that webpage actions are mostly asynchronous.
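To make the "when to wait" idea concrete, here is a sketch that keeps a set of in-flight requests. onBeforeRequest, onCompleted, and onErrorOccurred are real webRequest events and requestId is a real field of their details object; trackRequests and the object it returns are assumptions of mine. It needs the webRequest permission plus host permissions for the watched URLs.

```javascript
// Track requests that have started but not yet finished.
function trackRequests(webRequest) {
    const pending = new Set();
    const filter = { urls: ["<all_urls>"] };

    // A request starts: remember its id. A redirect keeps the same id, so a Set is enough.
    webRequest.onBeforeRequest.addListener((details) => {
        pending.add(details.requestId);
    }, filter);

    // A request ends, successfully or not: forget its id.
    const finished = (details) => {
        pending.delete(details.requestId);
    };
    webRequest.onCompleted.addListener(finished, filter);
    webRequest.onErrorOccurred.addListener(finished, filter);

    return {
        isIdle: () => pending.size === 0,
        pendingCount: () => pending.size,
    };
}

// Only runs inside an extension where `browser.webRequest` exists.
if (typeof browser !== "undefined" && browser.webRequest) {
    const requests = trackRequests(browser.webRequest);
    console.log("idle at startup:", requests.isIdle());
}
```

A player could then hold back the next action while isIdle() returns false.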

There are more browser APIs as well. The ones I find most interesting and will consider in the future are contextMenus, cookies, history, runtime, sessions, and windows.

But for now I will focus on the events listed above as the foundation of the browser automator project.

As you can see, listening to browser actions takes a bit of work. I also hope that replaying those actions can be done entirely in JavaScript.
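As a first idea of what replaying in JavaScript could look like, the sketch below finds the recorded element and dispatches a synthetic event on it. replayAction and the action shape ({ type, target }) are hypothetical. One known limitation: events created by script are untrusted, so they run the page's listeners but do not trigger every built-in behaviour (a synthetic keydown does not type text into an input, for example). In a real page MouseEvent or KeyboardEvent would probably be a better fit than the generic Event used here.

```javascript
// Replay one recorded action on the element matched by its CSS selector.
// Returns false when the element cannot be found.
function replayAction(root, action) {
    const element = root.querySelector(action.target);
    if (!element) return false;
    // bubbles: true so listeners registered on ancestors see the event too.
    element.dispatchEvent(new Event(action.type, { bubbles: true, cancelable: true }));
    return true;
}

// In a content script the root would be the page document.
if (typeof document !== "undefined") {
    replayAction(document, { type: "click", target: "body" });
}
```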

My goal is to create a browser automator that works locally, without any cloud or internet connection. Every action will be stored in the extension's localStorage. I already know that webpage and browser interactions can be recorded into a set of user actions and then saved under a name. What I will focus on next is replaying those webpage interactions and browser actions.
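Saving a named set of actions could be as small as the sketch below. It assumes the storage object follows the Web Storage interface (getItem/setItem), which window.localStorage does. The "recording:" key prefix and both function names are made up for illustration.

```javascript
// Hypothetical key prefix that keeps recordings apart from other stored values.
const RECORDING_PREFIX = "recording:";

// Store a list of actions as JSON under the given name.
function saveRecording(storage, name, actions) {
    storage.setItem(RECORDING_PREFIX + name, JSON.stringify(actions));
}

// Load a list of actions by name; an unknown name gives an empty list.
function loadRecording(storage, name) {
    const raw = storage.getItem(RECORDING_PREFIX + name);
    return raw === null ? [] : JSON.parse(raw);
}

// In an extension page the storage would be window.localStorage.
if (typeof localStorage !== "undefined") {
    saveRecording(localStorage, "example", [{ type: "click", target: "body" }]);
}
```

Because the actions are plain JSON, they can also be edited by hand before being replayed.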

Important points to consider along the way: creating pauses between actions so that the recorder and the player can be paused, allowing manual modification of a set of actions, and creating some sort of universal pseudo description language for those interactions. So stay tuned for more insights from the struggles of building browser automation.
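To give a feeling for what such a description language might look like, here is a small sketch that turns recorded actions into text lines and inserts a wait line wherever the gap between two actions is long enough. The line format ("click selector", "wait milliseconds"), the 250 ms default threshold, and the toScript name are all invented for illustration.

```javascript
// Convert recorded actions ({ type, target, time, value? }) into script lines.
function toScript(actions, minPauseMs = 250) {
    const lines = [];
    let previousTime = null;
    for (const action of actions) {
        if (previousTime !== null) {
            const gap = action.time - previousTime;
            // Only gaps of at least minPauseMs become explicit pauses.
            if (gap >= minPauseMs) lines.push(`wait ${gap}`);
        }
        const base = `${action.type} ${action.target}`;
        // Values (e.g. typed text) are JSON-quoted so spaces survive.
        lines.push(action.value === undefined ? base : `${base} ${JSON.stringify(action.value)}`);
        previousTime = action.time;
    }
    return lines;
}

const exampleLines = toScript([
    { type: "click", target: "input#q", time: 1000 },
    { type: "type", target: "input#q", value: "cats", time: 1100 },
    { type: "click", target: "button#go", time: 2100 },
]);
console.log(exampleLines.join("\n"));
```

Lines like these are easy to edit by hand, and a wait line gives the player a natural place to pause.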