{August 1, 2011}   Beyond Activities: Cross-Device Sessions

Update: I’ve scheduled the BoF session for Thursday, 14:00 in room 1.404/1. I’ll pop into the systemd BoF on Wednesday, too.

Update 2: It turns out systemd is solving an entirely different problem. :) There’s no overlap between it and sessions-as-in-restoring-windows.

I’ve been trying to write this blog post for months, and I keep hitting writer’s block or other distractions. So, screw it – I’m just going to start writing and see what comes out. :)

What this is about, is sessions, and XSMP, and wayland. Activities use XSMP, the X Session Management Protocol, to save and restore groups of windows. Before that, it was only used for the login session. It’s actually a better protocol than people give it credit for, and it served well for a few decades – but times have changed. If we want to move forward and do awesome things like sharing sessions between devices, we need something new. Even Activities as they are now push the limits of XSMP – there are a few ugly little hacks hiding in there that I’d prefer not to have. :)

Now, the key here is, if we’re going to replace an ancient-but-reliable technology with something new, we want the new one to be not just better but a lot better. Something worth switching to. Small issues like XSMP’s lack of autosave will be easy to correct; the big one is that it’s process-based. Session keys are handed out per-process, and when things get restored, they’re done by calling that particular binary with the session key as an argument. This is the source of two major pains; the first is that when a process has multiple windows (like konsole), you can’t properly put those windows in different sessions – so Activities have to do ugly things to hide this. The second, and larger, problem is that it’s ridiculously non-portable and opaque. What if the program’s binary gets moved elsewhere? What if the user switches to a competing app? And of course, their phone is not going to have the same programs installed as their laptop! Even if they have a meego phone and use Calligra on all their devices, the binary for the phone will be named differently. :P

So, what do we do? We make it resource-based! :) Store data in a standard place and standard format, so that the session manager, instead of seeing “this session had /usr/bin/firefox, /usr/bin/konsole and /usr/bin/okular” sees “this session has web pages X Y and Z, which this device last used firefox to open, with X and Y in the same window; two terminals, which this device last used konsole for; and foo.pdf on page 3, which this device last used okular for”. Then if okular isn’t available it can ask the system for a program that handles pdf, and when the user sends the session to their phone, it doesn’t matter that there’s a different pdf reader, or even that the phone browser doesn’t support tabs. Not only that, but if there were 20 urls open instead of three, the phone could go “whoa, that’s a bit much; I’ll just open the first three and display links to the rest somewhere.” And I could go “hmm, when the heck did I create an activity called ‘shambles’? wtf is in it?” and get an answer without actually having to load it. ;)
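To make that a bit more concrete, here is a rough Python sketch of the idea. Nothing like this exists yet: the names (Resource, choose_app, plan_restore), the handler table and the max_open limit are all invented for illustration. It only shows the shape of the decision: prefer the app this device last used, fall back to whatever handles the type, and defer the overflow as links.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    uri: str
    mime_type: str
    last_used_app: str  # app this device last used to open the resource


def choose_app(resource, installed_apps, default_handlers):
    """Prefer the last-used app; otherwise ask for any handler of the MIME type."""
    if resource.last_used_app in installed_apps:
        return resource.last_used_app
    return default_handlers.get(resource.mime_type)


def plan_restore(resources, installed_apps, default_handlers, max_open=None):
    """Return (to_open, deferred).

    to_open is a list of (app, uri) pairs to launch; deferred is a list of
    URIs the device chose not to open (no handler, or too many at once) and
    could show as plain links instead.
    """
    to_open, deferred = [], []
    for res in resources:
        app = choose_app(res, installed_apps, default_handlers)
        too_many = max_open is not None and len(to_open) >= max_open
        if app is None or too_many:
            deferred.append(res.uri)
        else:
            to_open.append((app, res.uri))
    return to_open, deferred
```

The important bit is that the decision needs only the resource list, not the binaries that produced it, so a phone can apply a different policy to the very same data.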

Another side of this is, well, the small stuff does add up. I just heard that OSX gained some decent session support; they’ve taken lessons from the iPhone and applied them to the desktop, making a system where it doesn’t matter if an app crashes – the OS may even kill it whenever it pleases – the session support is so good it’ll come back just as it was. I’d love to see that sort of solid session support in linux; KDE apps with XSMP are pretty good right now, but they could be better (and an autosaving protocol would go a long way towards fixing that).

Now, replacing XSMP isn’t an entirely new idea. Gnome people have been sick of XSMP for quite a while, although I don’t think anyone stepped forward with an alternative (and for their purposes, it does work, anyways). When I talked to the Nepomuk guys many months ago, about cross-device sessions, they had already thought of it – but decided the political side of it was too hard. They might still be right – we’ll see. But it’s worth a try. :)

So, what is the hard part, you ask? It’s the reason I stuck with XSMP for activities: a new protocol means persuading apps to support said protocol. XSMP’s support, while patchy in places, is at least decently widespread and un-controversial. A new protocol… well, I’m fairly sure KDE would embrace it, but I want it to be a part of gtk apps too, and pure qt apps, and meego apps, and even the weird proprietary fringe ones (like maple) someday. That is a challenge. :)

So how do we meet this challenge? First, we explain the benefits it brings – I hope I’ve done that well enough above. Second, we offer app developers a well-designed, easy-to-use API that’s had some feedback from actual app developers; just the other day I saw someone on IRC complaining about XSMP being confusing. :) Third, once we’ve got enough support to know this is worth trying, we show them a solid implementation that works (without any fatal bugs). Seeing is believing, right? Nobody wants to spend precious time writing code for a system before it exists, so we get the minimal feature set done, port a couple of apps, and show it off. Fourth – the secret weapon. ;) Wayland is gaining popularity, and when apps start porting to it, they may as well port to this too, right? Wayland doesn’t have any session protocol of its own, so this could be it. :)

So, who’s with me? :) The desktop summit is only a few days away, and surprisingly I will make it (yay!) so I’d love to have a conversation about this there. It’s the perfect place, after all, for something that aims to be cross-device, cross-desktop and all.

TL;DR: a resource-based session protocol would let us do really awesome things, so it’s worth the effort to replace XSMP

Now, I’m going to go into technical details below the cut (I have enough material to write a paper on this but for that damn writer’s block…)

So, technical details. Right. What actually needs to be implemented? What API might it need?
There are three parts to it, all interlocking. First, the common storage area. This can actually come before the rest, because XSMP doesn’t say anything about how session data is stored; apps can start using it and still be started by an XSMP server, which is great for the transition time (it’ll be long; nothing this big changes fast).

Anyways… I haven’t done much research on storage. It could be stored in nepomuk/tracker/zeitgeist, or in text files like kconfig, or even in a registry (eek) – all that matters is that the system does its job well, and that all the apps on the device are using the same one. It might turn out that different backends are better for different devices – I don’t know, and I would love to hear information about the pros and cons of various approaches. Apps will be updating certain data on a regular basis – page number, position in a movie, etc – so I’d like numbers on efficiency too.

I do at least have a rough sketch for the format of the data, though. The vast majority of resources will be a URI of some kind (usually local or http) and a position within the document. The URI becomes the key, the location one of the fields. Other fields will be the window it’s in (after all, windows are still the unit that window managers work with, and their position and size are important session data too), the program last used to open it, custom data fields for that program (preferably kept to a minimum), and perhaps data from other devices (either to help with decisions where this device is unsure, or in case the session is transported to a third device where that info could matter). Session data for windows would be written out, perhaps by the window manager… Really I need a whiteboard and other programmers to come up with good data structures, so I want to go over this in Berlin, and I’m sure it’ll change once again during implementation. :) The most important thing is just to settle on some resource-based structure that works.
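As a starting point for that whiteboard session, here is one way the fields above could be written down, in Python. This is only a sketch under my own assumptions: the class names, the field names and the choice of JSON as the on-disk format are all placeholders, not a proposal for the final structure.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ResourceRecord:
    position: str = ""   # location within the document, e.g. "page=3"
    window: str = ""     # id of the window displaying the resource
    last_app: str = ""   # program this device last used to open it
    app_data: dict = field(default_factory=dict)       # custom per-program fields
    other_devices: dict = field(default_factory=dict)  # hints from other devices


@dataclass
class WindowRecord:
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0


def dump_session(resources, windows):
    """Serialize {uri: ResourceRecord} and {window_id: WindowRecord} to JSON text."""
    return json.dumps(
        {
            "resources": {uri: asdict(rec) for uri, rec in resources.items()},
            "windows": {wid: asdict(win) for wid, win in windows.items()},
        },
        indent=2,
        sort_keys=True,
    )


def load_session(text):
    """Inverse of dump_session."""
    data = json.loads(text)
    resources = {uri: ResourceRecord(**rec) for uri, rec in data["resources"].items()}
    windows = {wid: WindowRecord(**win) for wid, win in data["windows"].items()}
    return resources, windows
```

Because the URI is the key and everything else is an ordinary field, any tool can list what a session contains without starting a single app.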

Second is the session manager. This is a program – likely a daemon, but not necessarily – that loads up the apps when a session is opened, and tells them to close when appropriate. If there is a user-interaction feature (“would you like to save this document?”), it would handle that too. This is what ksm-server currently does for XSMP. It’s actually not that much work, mostly reading and writing config files, but it’s complicated by the apps’ ability to request user interaction (stalling session-close) or even cancel it entirely.

Third is the application API, the part most developers will see. I want this to be good. I’m also wondering about backwards and forwards compatibility here; the XSMP protocol hasn’t changed in forever, but a new protocol will have changes, if only to fix bugs or correct design mistakes. Most apps will be sharing the same dynamically linked library, but should we worry about statically-linked ones? Hopefully we can come up with a design that is fairly robust against such cases. I’m reminded of dbus here; it specifies its wire format, which has the advantage that all apps speak the same language forever, but the disadvantage that they can never improve that format ever. :/

Speaking of dbus, one of the things I’m wondering is, how should the session manager and apps communicate? I think the storage should be written to as directly as possible, for efficiency (write-write conflicts will be exceedingly rare, if they happen at all, and I expect the same for read-write conflicts). We might want some sort of throttling if the filesystem’s buffers aren’t enough; as a worst-case scenario, consider a dozen apps each updating their state every 10 seconds, out of sync; that works out to 12 writes per 10 seconds, or more than a write a second. Hopefully there won’t be more than one or two apps doing regular updates like that (you can only watch one movie at a time, and even when reading there’ll only be one source of background music) but it’s something to consider later.
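For the record, the worst case above and one possible way to tame it, sketched in Python. The CoalescingWriter class and its min_interval parameter are hypothetical; the idea is simply to keep only the latest value per key and write a batch at most once per interval. A real version would also need a timer to flush whatever is still pending.

```python
import time


def writes_per_second(apps, interval_seconds):
    """Average write rate if every app saves once per interval, out of sync."""
    return apps / interval_seconds


class CoalescingWriter:
    """Batch session updates so storage is written at most once per interval."""

    def __init__(self, write, min_interval=5.0, clock=time.monotonic):
        self._write = write              # callable that receives a dict of updates
        self._min_interval = min_interval
        self._clock = clock
        self._pending = {}               # latest value per key since the last flush
        self._last_flush = None

    def update(self, key, value):
        """Record a new value; older pending values for the key are dropped."""
        self._pending[key] = value
        self.maybe_flush()

    def maybe_flush(self):
        """Write pending updates if the interval has elapsed. Returns True if written."""
        if not self._pending:
            return False
        now = self._clock()
        if self._last_flush is not None and now - self._last_flush < self._min_interval:
            return False
        self._write(dict(self._pending))
        self._pending.clear()
        self._last_flush = now
        return True
```

With a dozen apps on a 10-second cycle, writes_per_second(12, 10) gives 1.2; behind a 5-second coalescing window the same traffic becomes at most one batched write every 5 seconds.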

Whoops, I got side-tracked; I was meaning to talk about communication with the session manager. While upgrading ksmserver and kwin to support sub-sessions (aka activities), I discovered the hell of dbus timeouts. I’d rather not go down that road again, but I’m not aware of the alternatives. XSMP uses some ancient protocol called ICE, which seems to have close ties to X11; it’s not the only protocol with that name, though, so googling is a bit of a pain. In any case, I’d like to sever the ties to X11 so that this can be easily used with wayland too – what do they use for communication?

Wandering onwards (my wrist’s getting a bit tired now despite breaks), the api offered is partially defined by the features offered. I think that the first version should omit all but the most vital features; some, like user-interaction and cancellation of the session-close, I’m leaning towards abolishing altogether. This is 2011, not 1990; apps ought to be capable of behaving sensibly when it’s time to quit. Even kate now has swap files to recover data after a crash; those can be used just as well to restore a session. The only downside is they’re not so portable… that’s a problem for later, I think. :) Heck, I might even leave out the ability to put a window in N sessions at once; it does so complicate things, I’m not sure if it’s actually a worthwhile feature in activities, and it would still be possible to have the feature in activities by creating extra sessions under the hood.

So, what features are really needed?
-either the app or the window manager ought to record the size and position of windows. If it ends up being done in the app, it should be entirely automatic, within the library, not something the app developer needs to fuss with at all.
-applications need to record that they’re displaying a resource in a certain window.
-they need to record that they’ve stopped displaying it too. :)
-either the window or resource needs to be associated with a particular session when it shows up.
-apps need to be told when to close.
-apps need to be able to restore themselves from session data.
-apps need a method to store custom session data
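Here is how that basic list might look from the app side, as a Python sketch. Every name in it (SessionStore, SessionClient and all the methods) is invented for illustration, and the store is just an in-memory stand-in for the shared storage area; the real API would need feedback from actual app developers.

```python
class SessionStore:
    """In-memory stand-in for the common storage area."""

    def __init__(self):
        self.entries = {}    # (session, window, uri) -> dict of session data
        self.geometry = {}   # window -> (x, y, width, height)


class SessionClient:
    """What a single application would talk to."""

    def __init__(self, store, app_name):
        self.store = store
        self.app_name = app_name
        self._close_handlers = []

    def window_changed(self, window, x, y, width, height):
        # Meant to be called by the library/toolkit itself, not by app authors.
        self.store.geometry[window] = (x, y, width, height)

    def resource_shown(self, session, window, uri):
        self.store.entries[(session, window, uri)] = {"app": self.app_name}

    def resource_hidden(self, session, window, uri):
        self.store.entries.pop((session, window, uri), None)

    def set_custom_data(self, session, window, uri, **data):
        self.store.entries[(session, window, uri)].update(data)

    def on_close(self, handler):
        self._close_handlers.append(handler)

    def request_close(self):
        # Called on behalf of the session manager when it is time to quit.
        for handler in self._close_handlers:
            handler()

    def restore(self, session):
        """Return (window, uri, data) for everything this app had in the session."""
        found = [
            (window, uri, dict(data))
            for (sess, window, uri), data in self.store.entries.items()
            if sess == session and data["app"] == self.app_name
        ]
        return sorted(found, key=lambda item: (item[0], item[1]))
```

Note that the app never "saves the session" as a separate step: it records what it is showing as it goes, which is what makes autosave and crash recovery fall out for free.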

That’s just the most basic of basics, mirroring XSMP’s abilities. To make the thing actually cool, we’ll also need to:
-tell the app exactly which windows to close, in case it’s spread across sessions
-tell the app whether it should restore the whole session or just a part of it
-allow and encourage apps to store common, portable session data (like the position in a document) in a standard place that any app can use
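The first of those points, telling an app exactly which windows to close, could be worked out by the session manager roughly like this. It is a Python sketch with a made-up plan_close function and a made-up window table; the only real point is that an app quits entirely only when none of its windows belong to another session.

```python
from collections import defaultdict


def plan_close(windows, session):
    """Decide what each app must do when one session is closed.

    windows maps window id -> (app, session). The result maps each affected
    app to the windows it should close and whether it can quit entirely.
    """
    closing = defaultdict(list)
    remaining = defaultdict(int)
    for window_id, (app, owner_session) in windows.items():
        if owner_session == session:
            closing[app].append(window_id)
        else:
            remaining[app] += 1
    return {
        app: {"close": sorted(window_ids), "quit": remaining[app] == 0}
        for app, window_ids in closing.items()
    }
```

For a multi-window process like konsole with one window in each of two sessions, closing one session yields a request to close a single window and keep running, which is exactly what XSMP cannot express.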

I’m sure I’ve missed a thing or two, but you get the idea. :)

Actually, there’s a fourth part to this, too: the device sync. How will two devices share their session data? Sending a list of resources and associated data sounds easy, but there are plenty of details to figure out – which resources need copying to the other device, how to manage the change of URI (a file at /home/chani/Documents/foo.pdf on my laptop will end up somewhere else on a phone), whether to try and resolve conflicts… :) I’m pretty sure there are people who have given this part more thought than me, though. And as a bonus, the sync code will come in handy for migration should a distro change to a vastly different storage backend someday.
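The change-of-URI part, at least, is easy to sketch. The Python below assumes (and it is only an assumption) that each device knows a root directory under which the user’s documents live; remap_uri and the target path in the usage note are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def remap_uri(uri, source_root, target_root):
    """Translate a resource URI from one device's layout to another's.

    Non-file URIs (http and friends) are returned unchanged. Local files
    under source_root are moved under target_root. Local files elsewhere
    return None, meaning somebody has to decide where they should go.
    """
    parts = urlsplit(uri)
    if parts.scheme != "file":
        return uri
    relative = posixpath.relpath(parts.path, source_root)
    if relative == ".." or relative.startswith("../"):
        return None
    return "file://" + posixpath.join(target_root, relative)
```

So remap_uri("file:///home/chani/Documents/foo.pdf", "/home/chani", "/home/user") would give "file:///home/user/Documents/foo.pdf", while web pages pass through untouched. Which files actually need copying, and what to do about conflicts, is the hard part and is not covered here.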

All this is going to take time to implement, and even more time to be adopted. It’s a multi-year project. But if the people want it to happen, it can be done. :)

TheBlackCat says:

Just closing and closing for later restoration seem slightly different in one key aspect: in the latter, users are pretty much guaranteed to want to have changes they have made to the document or page stored, while in the former case they are not. So I think the API needs to be able to tell programs this.

For example, when you open, say, kate or kwrite you may be using it just as a scratch pad or be working on something important. When you close it, it is correct for the program to ask you whether you want to save the changes or not. When you suspend, however, it should not ask; the window should just disappear. Should it just go ahead and save it automatically, or should it be saved to a temporary file and then restored from there? I am not sure, and I think it would probably be better for each program to handle this in its own way depending on its needs. However, in such a case the program needs to know whether it is being closed outright or just suspended so it can behave properly.

Chani says:

yes; session-close and user-close are different events (even if some programs cheat and treat them the same right now :P)

Chani says:

TBH, I hope that one day this will be a moot point, because we’ll have a versioning filesystem and every program will just save your work to disk on a regular basis, and we won’t need save buttons at all. :)

Dakon says:

I don’t think using the filename (alone) is a good idea for the key. Let’s say I’m editing a PDF document in some PDF editor (first user), then I have it open in Okular (second user) and Acrobat (third user) to view the results. And they all may be at different positions and so on. Two of them doing actually the same thing (view) while the other one is doing something entirely different (edit). I fear things will get even more complicated :(

Chani says:

ah yes… I forgot about that while writing this. there’ll have to be something extra in the key to distinguish those cases. perhaps a number would be sufficient (mypdf-1, mypdf-2, etc)? or, another option would be to give each resource a UUID and have the URI be just another field. something about that feels funny though.

Rsh says:

DBUS timeouts? Maybe fix DBUS or go back to DCOP? :p

somebody says:

some people want (and probably work on it) to use systemd as a session manager

take a look at it

Chani says:

I’m afraid those notes make little sense to someone who doesn’t really know what systemd is :)

also, it kinda *looks* like it’s talking specifically about login sessions. I need more than that, I need multiple (possibly overlapping) sessions within a login, for Activities.

somebody says:

maybe you can ask/contact lennart himself for more info on the subject and possibilities

he will probably be able to help more

guest says:

systemd is not portable (Ubuntu, BSD).

somebody says:

it works on any linux and I don’t think we should care much about an os that doesn’t care about the desktop much

guest says:

KDE is the default BSD desktop.

guest says:

Why do you need to rewrite or replace XSMP? As a first approach, you could propose one or more new properties for XSMP’s SetProperties response that will store the filename, URI, etc. (i.e. resources, not the program path and arguments as for RestartCommand), suggest a policy for how to use these properties, and make the applications store the resource information after a SaveYourself request. Or you can make the applications save the name of some metaprogram (with some standard, well-known name) that will restart the proper application on your hypothetical phone based on MIME type. I think there are many solutions.

It’s better to update XSMP (preserving old functionality to be backward compatible) rather than drop it.

Chani says:

Well, one of the reasons I mentioned was to break ties with X11, so that it’s more useful to wayland.
The other is that I’m talking about changing the fundamental design of the system. The whole SaveYourself *concept* is outdated. There probably wouldn’t be enough of XSMP left to really call it XSMP in the end.

I’ve already extended XSMP (unofficially, with dbus calls) and I have learnt that it’s not enough.

Backwards compatibility can be done by running a session server that supports both protocols, same as Wayland runs an X server for backwards compatibility.

guest says:

Wayland on desktop is still questionable. My position (somewhat radical) is that Wayland is a diversion, a bad diversion. X works fine here and now. From my point of view, it is not practical to break the things that work, and it’s not practical to drop a well-known and well-described ecosystem. Actually, a redesigned XSMP (say, version 2) could later be ported to Wayland. The relation between XSMP and X11 is not so strong; XSMP even uses a different protocol. If you propose XSMP2 (based on ICE), it quickens the adoption of the new (resource-based) principles of session management and probably simplifies the political side with the X Consortium – dark side of the Force – and the toolkit teams. :) Otherwise you risk wasting your time. Today X is the only solution for Linux, BSD, Solaris desktops, so the right strategy today, IMHO, is to provide a real solution, i.e. not a hypothetical one for a hypothetical Wayland. :)

Proposing the new versions of X extensions is usual. Look at the X Input v. 2 – it has many new features that prevent their inclusion into X Input v. 1. So, nothing to fear.

Chani says:

hmmm. I see what you’re saying, although I don’t think kwin’s wayland code is “hypothetical” ;)

ideally, I’d love a protocol that both X and Wayland can use; this needs some more thought.

[…] This post is a translation of this article, written today. […]

Artem says:

What about one small step further? If an application is able to describe its state as a set of simple terms (URIs, keywords, and so on), then why do it only on closing?
Use case 1:
I am going to work and want to take my browser session to my Android.
–Solution (current):
Close firefox/Save firefox, move session to mobile and restore it there.
–Sequence in details:
Application gives us a set of terms that describes its state -> we save it to file/nepomuk/zeitgeist -> copy/send it to Android and it is loaded there.

Use case 2:
Now I want to take only one window out of 3 with me / 5 tabs out of 120.
–Sequence is:
…. -> save it to file/nepomuk/zeitgeist -> somehow extract only terms that are necessary to describe only 1 out of 3 windows/ 5 out of 120 tabs -> copy/send it to Android

I think that the general idea should be a little more common:

From the application’s point of view:
Received a call asking to ‘describe’ some part of the application as a set of terms (URIs etc.) -> Return this set of terms or deny if not possible

From the system’s point of view:
(User shuts down session | User makes an action that indicates that he wants to move a window/session/application/etc. to another device | On a timer basis) -> Ask application to describe itself as a set of terms -> Send this sequence to one or more destinations: File, Nepomuk, Zeitgeist, Jabber, Mobile phone via Bluetooth etc. -> do some actions with the application if needed: close on session-close, nothing on send-to-Jabber etc.

This way the same protocol is used not only for saving/loading sessions even on different devices, but for copying/moving ‘activities’ between any source and destination in runtime, and backup file/nepomuk/zeitgeist is just one of the possible destinations and sources.

Chani says:

:) you missed my comments about needing the protocol to ‘autosave’. You’re still thinking about an application saving session data on close, or on some other event: what I want is for it to save session stuff *continuously* so that we always have the data. That way we can grab it at any time, which means not only the things you said, but also wonderful happy near-flawless crash recovery. :)

Artem says:

Nope. I think I just didn’t understand it correctly. But – don’t blame me – I want to understand your idea, compare it with mine and make it clear for me. In your article you are speaking about the protocol as a session protocol and (as far as I understand) you include the storage location and some other (unnecessary in my opinion) things in the protocol:
-tell the app exactly which windows to close, in case it’s spread across sessions
-tell the app whether it should restore the whole session or just a part of it
-allow and encourage apps to store common, portable session data (like the position in a document) in a standard place that any app can use

If I am wrong, then you may skip the next part.

I mostly suggest splitting it into 2 different protocols – “serializing/deserializing” and “managing serialized data” (again, if it looks like what you have suggested then you should skip this part and forgive my stupidity).
In short (referring to the 3 parts mentioned in the article)
1. common storage – Not in protocol, implementation defined.
2. session manager – Not in protocol, implementation defined. Defines №1.
3. API – reduced, only core things are in protocol 1, all other is left for session manager( and protocol 2)

<Here I will give the description of this separation. If I understood you wrong and all of it was in your article, then you should skip it, sorry>
Protocol 1 ‘serializes/deserializes’ the application to/from the set of terms. That is the protocol that the application must be aware of. A serialization request should not demand that the application do anything with itself – no forced closing of windows etc. And the application does not write terms to the file – it simply “sends” them back. Deserialization should just load the application from a given set of terms. It does not read them from a predefined file – it receives them somehow (probably as command line arguments or via a temporary file). The protocol itself is a little bit more complex than just that – for example we may need one application to attach its terms to the terms of another application – this is how (I think) the window manager should work. It will attach screen/position/size/etc. of the window to the data of the application itself.

Protocol 2 is not the [only] protocol, but rather an application stack. (I will refer to parts of this stack as services so as not to confuse them with user applications.) In this stack there are services that will periodically send requests to the applications and save the received terms to some predefined location (that the serialized application itself is not aware of, but services from the stack are aware of) – the “autosave” you have mentioned; a service that provides the user the ability to save a session, restore a session, or select only part of an application/session to be restored; and a service that is responsible for sending a set of terms correctly to another device (the most advanced, as it must parse the serialized terms and transmit not only terms, but the whole closure: files etc.). The stack is responsible for handling situations when a ‘pdf-editor’ is requested, but there is no application that can edit PDF on the system. The stack is responsible for syncing stored session data with the web (like Firefox Sync) etc.

To tell the application that it should now close this window and this window, or should ask user about saving data and prepare to die in pain we _must not_ use the protocol1 from previous paragraph. Because it will make it useless/hard-to-maintain on system where there is no way to close an application or there is only one possible window.

Most systems will have a partially specialized stack. The stack on Android, for example, will simply ignore all terms belonging to the window manager and [semi]automatically download missing applications from the Market.

I think that protocol1 is the most important part, because after it is implemented, we can add more and more features to the services stack or easily change existing ones without requiring changes to applications – constant autosaving, sending an application through Jabber to another machine, automatically sending files to Dropbox/ownCloud during terms transfer to make them available on mobile devices etc.

Chani says:

Common storage in a standard format is important: First, I want different applications (of the same type) to be able to understand the same data. Even your android appstore won’t have *every* app in the universe (I’m pretty sure it doesn’t have kpdf right now ;). The user generally cares more about their documents than about which brand of app they’re in.
Second, polling for autosave is very inefficient. applications ought to just update their session data themselves, when appropriate. There’s less chatter that way. Especially since 90% of the time, only one or two apps will have anything new to say. Plus, you avoid all the annoying wait-for-each-app-to-respond code.

Also, the things you consider useless for android might not be forever. maybe someone will want more than just a login session one day, so they can separate work and personal stuff. Although, making them an optional part of the session manager shouldn’t be hard. I expect the session manager to have much more flexibility in general – a standardized data format means it can inspect the session and choose to do something other than blindly starting all the apps.
The inability to close apps is something to consider, though. o.0 I’d forgotten about that little quirk of android. What *do* you do with an app when you’re done with it and want the resources for something else, or just want to stop being distracted by it?

revnar says:

“systemd is not portable (Ubuntu, BSD).”

It works with Ubuntu (they’ll switch in the future) and bsd is not Linux, so why should we care? systemd will become the default init in the Linux world, so it should be a priority to make KDE expose systemd and Linux features.

“KDE is the default BSD desktop.”

But bsd is not the default KDE system. There’s no single reason to support some legacy OSes.

Chani says:

Well, I’d prefer not to leave out BSD, but it’s probably a moot point:

-can systemd close a group of (not all) windows when the user presses a button?
-can systemd reopen that session when the user presses a button?
-can systemd tell an app “close this window, not that one”?
-can systemd find and use an alternative app when the last-used one isn’t available?

hmm, I wonder if last-used is even the right concept… if I switch my default web browser from firefox to rekonq, do I want all the webpages stored in my activities to switch browsers too?

oliverhenshaw says:

systemd is concerned with processes not windows – but interestingly also seems to be toying with the idea of an application as a process associated with a .desktop file.

The interesting parts of systemd as a user session manager in this context seem to involve
1. putting all processes in the login session in a cgroup (and systemd+cgroups seem to do nested hierarchies quite well, so sub-sessions may work well)
2. process lifetime management, partly thanks to cgroups.

But I’m not an expert, more of an observer here.

anon1 says:

KDE is portable. It runs on Linux, BSD, different Unix’s, Windows, OSX, …
systemd is not portable. Not even across Linux distributions.

X11/Wayland (KWin), Xine/GStreamer/VLC (Phonon), … KDE always made sure to not depend on a single optional technology that has strong competitors and may be replaced with something better in the near future.

Systemd has init.d and upstart as alternatives and both of them are more used than Systemd. Systemd needs to prove it can compete with the alternatives; at the moment it cannot, and even if it can someday, the alternatives will not go away. Those are facts.

Ben says:

systemd seems to be working quite fine on my desktop, and I fail to see how it’s not portable across Linux distributions. It even has SysV Init fallback support, so it can be used (more or less) drop-in (though, you’d certainly have a better experience with native configuration files, it’s still decent without)..

You can install (it won’t be by default…yet) systemd on Gentoo, Arch, debian, Mandriva…. It’s supported well (and default) on Fedora, but certain other distributions are seriously considering it and bringing up support at a reasonable pace.

I think systemd has a dbus interface, though I don’t know if it’s used for the parts that KDE would need to support. I won’t argue that KDE should support systemd though, since I don’t know anything about that.

robbat2 says:

One of the concerns I have with all the systemd talk here is everything depending on it, and preventing choice of init systems. Alas, it seems udev may end up depending on systemd :-(

