Jabber Summer Development    J. Miller
 The Project
 June 2000

Jabber Summer Development


This should act as a general guideline for the focus of Jabber development for the summer of '00. It covers all code, protocol, and applications. This is a group document; feel free to make your own additions and updates.



1. Introduction

If you're reading this, you shouldn't be here! This is for active developers working on Jabber, you know who you are :)


2. Protocol

2.1 Presence

2.1.1 Devices

Available presence implies that the USER is available for messaging. What if you want to communicate with a connected device where the user may or may not be available, such as a pager, wireless palm pilot, [cell] phone, alternate inbox, appliance, or some other "device"? You may also use separate providers for work or home, and have a forwarding resource that each account broadcasts so that users of either can choose to message you at the alternate account.

To do this, a new presence type is defined as: <presence type="alt"><status>Pager</status></presence>. Clients can use this to build a list of the user's own resources, maybe offer "switch to or check alternate inbox", etc. Any client should display the sender of these presences as alternate resources available for direct messaging. In the future an <alt>devicename</alt> may be added for "programmable" access to what is actually the recipient.

for now, these alternate resources would be represented and managed by specific rules as configured by the user/client for mod_filter.
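As a rough illustration of the proposed stanza, the sketch below builds and recognizes an "alt" presence with Python's standard XML library. The function names are hypothetical and only the element and attribute names come from the text above.

```python
import xml.etree.ElementTree as ET


def build_alt_presence(status_text):
    """Serialize the <presence type="alt"> stanza proposed above."""
    presence = ET.Element("presence", {"type": "alt"})
    status = ET.SubElement(presence, "status")
    status.text = status_text
    return ET.tostring(presence, encoding="unicode")


def is_alt_presence(stanza_xml):
    """Return True when a received stanza advertises an alternate resource."""
    root = ET.fromstring(stanza_xml)
    return root.tag == "presence" and root.get("type") == "alt"


stanza = build_alt_presence("Pager")
```

A client could run is_alt_presence over incoming presence to decide which senders to list as alternate resources.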

2.1.2 Errors/Redirects

presence type='error' is sent in response to presence s10n requests only (not for normal transient presence types). 404 for nonexistent users, 502 for remote server errors, etc.

302 redirect can also be sent for two reasons. the first is in response to a s10n request (possibly an agent trying to "rename" the jid being subscribed to). 302s can also be sent at any time for account forwarding, so a user can transparently change addresses and notify all the subscribers of the change. this 302 is important for the client to receive and handle (prompt the user to alert them and so they can accept the new jid), so it is cached by the server similarly to s10n requests. the cached 302 is cleared when the client renames or edits the item in question in any way (i.e. it's in an iq:roster set). this means that the server will need to support <item jid='' subscription='replace' newjid=''/> which preserves the s10n and all other settings of the item, just renames the jid.
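The rename behaviour can be sketched as follows. This is only a toy model: the roster is a plain dict keyed by jid, the cached 302s are another dict, and the jids and helper names are made up for illustration. The point is that the item keeps its settings and the cached redirect is cleared.

```python
import xml.etree.ElementTree as ET


def build_replace_item(old_jid, new_jid):
    """Serialize the roster <item/> asking the server to rename a jid."""
    item = ET.Element(
        "item",
        {"jid": old_jid, "subscription": "replace", "newjid": new_jid},
    )
    return ET.tostring(item, encoding="unicode")


def apply_replace(roster, cached_redirects, old_jid, new_jid):
    """Rename old_jid to new_jid, keeping every other setting of the item.

    Handling the item also clears any 302 cached for the old jid.
    """
    roster[new_jid] = roster.pop(old_jid)
    cached_redirects.pop(old_jid, None)


# Hypothetical data: one roster entry with a pending redirect.
roster = {"old@example.org": {"subscription": "both", "name": "Pat"}}
cached = {"old@example.org": "new@example.net"}
apply_replace(roster, cached, "old@example.org", "new@example.net")
```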

2.1.3 Location

have a standard jabber:x:location namespace, for lat/long, address info, etc... for roaming clients, cell phones w/ gps, laptops. any xml dtd out there yet?

2.2 Messages

2.2.1 Envelope

x:envelope support. mod_filter would use for loop checking by adding forward-by headers, server would expand to/cc/bcc, for email-style client

2.2.2 Filtering

mod_filter provides a way for a user to 'filter' incoming messages based on rules set by the user. a rule consists of one or more conditions, and zero or more actions. rules are checked in order, and checking stops when a matching rule is found (unless an action is <continue/>).

current conditions








current actions






No action implies to drop the message. if no rules match the message, it will be delivered to the user. (or if the chain continues past a matching rule)

Future conditions: time, size, (others?) future actions: edit, error, (others?)
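The rule semantics described above (first match wins, <continue/> keeps checking, no actions means drop, falling off the end means deliver) might be sketched like this. The condition functions and the action names "forward" and "offline" are placeholders, not the module's actual condition or action list.

```python
def filter_message(message, rules):
    """Return the actions to perform; an empty list means the message is dropped."""
    performed = []
    for rule in rules:
        if all(condition(message) for condition in rule["conditions"]):
            performed.extend(a for a in rule["actions"] if a != "continue")
            if "continue" not in rule["actions"]:
                # A matching rule without <continue/> ends the chain.
                return performed
    # No rule matched, or the chain continued past the last matching rule.
    performed.append("deliver")
    return performed


# Hypothetical rule set for illustration only.
rules = [
    {"conditions": [lambda m: m["from"] == "spam@example.org"], "actions": []},
    {"conditions": [lambda m: "urgent" in m["subject"]],
     "actions": ["forward", "continue"]},
    {"conditions": [lambda m: m["from"] == "boss@example.org"],
     "actions": ["offline"]},
]
```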

2.2.3 Events

for additional UI, specific message events can be exchanged between entities. the four events are: composing, offline, delivered (to client), displayed (made visible to user). to receive the events, you include <x xmlns="jabber:x:track"><offline/><displayed/></x> in the original message. events are sent back as <x xmlns="jabber:x:event"><displayed/><id>msg41</id></x> which represents the event(s) and original message id="" attribute. (<thread>...</thread> elements should also be sent back for window tracking, etc).

the composing event is sent when the user is actively typing a reply to the message, if the user never completes/sends the reply (or after an idle timeout) an empty x:event (just contains id, no events) would be sent to "clear" any outstanding events.

this is NOT required, and events should not be sent unless requested. the server will have to support offline and delivered events
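A receiving client's side of this exchange could look roughly like the sketch below: it replies with an x:event only when that event was requested in the original message's x:track. The function name and the sample addresses are assumptions. The namespaces and element names are the ones given above.

```python
import xml.etree.ElementTree as ET

TRACK = "{jabber:x:track}"


def build_event_reply(original_xml, event):
    """Return an event notification, or None if the sender did not ask for it."""
    original = ET.fromstring(original_xml)
    track = original.find(TRACK + "x")
    if track is None or track.find(TRACK + event) is None:
        return None  # events should not be sent unless requested
    reply = ET.Element("message", {"to": original.get("from", "")})
    x = ET.SubElement(reply, "x", {"xmlns": "jabber:x:event"})
    ET.SubElement(x, event)
    # Echo the id="" attribute of the original message.
    ET.SubElement(x, "id").text = original.get("id", "")
    return ET.tostring(reply, encoding="unicode")


# Hypothetical incoming message requesting two of the four events.
ORIGINAL = (
    '<message from="a@example.org" id="msg41"><body>hi</body>'
    '<x xmlns="jabber:x:track"><offline/><displayed/></x></message>'
)
displayed_reply = build_event_reply(ORIGINAL, "displayed")
```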

2.3 Agents / Browsing

oh boy, let's see if we can't just re-invent half of jabber :)

The basic idea is this: the jabber world is a diverse place, with lots of services, transports, software agents, users, group chat rooms, translators, headline tickers and just about anything that might interact on a real-time basis with conversational messages or presence. Every jid is a node that can be interacted with for messages, presence, and special purpose iq namespaces. Some jids are parents (transports), and often many jids have relationships with other jids. We need a far better way to structure and manage this culture of jid stew. The answer: BROWSING

I'm not sure how to explain this, it's finally coming together in my head but it'll take me a few tries to explain it logically (read: big fat warning if you read further that you'll be irreversibly corrupted). Think mime types, foo/bar, a common way to logically digest the world of document and media types. Along those lines, let's define a two-level system similar to mime types:

transport / [icq, aim, irc, smtp, ...]

conference / [private, irc, topic, url, ...]

user / [connection, inbox, forward, pager, device, ...]

application / [chess, whiteboard, dia, abiword, ...]

headline / [rss, stock, logger, ...]

render / [en2fr, sp2ru, jive, tts, ...]

keyword / [dictionary, thesaurus, faq, google, ...]

Now, taking this rough list, let's apply it to an XML browsing environment. Assuming your "home page" is your server, you send a jabber:iq:browse get to the server (leave out the to="" attrib). The server then responds with <transport jid="" name=" Public Development Server"> <transport type="icq" jid="" name="ICQ Transport"/> <conference type="private" jid="" name="Private Chatrooms"/> </transport> ...

... (switch to high level description mode, it's getting late) When browsing, each entity is represented by a node named by the top-level type and an optional type="" attrib for a specific sub-type. You "browse" to your server, it contains a big list of all the entities available or advertised from that server. Clients can just do an iq get and include a <transport/> to ask for just the transports (and include a type attrib for a more specific list), if that is all they support in their gui. Every entity in browsing can have a description and uri element, and two flags: browse and search. browse means you browse to that jid as another branch.
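To make the two-level naming concrete, here is a small sketch that reads a browse reply shaped like the one above and lists each child as "category/type". The jids and names in the sample are invented, and the filtering here happens on the client after a full reply, which is only one of the options described.

```python
import xml.etree.ElementTree as ET


def list_entities(browse_xml, category=None):
    """Return (category/type, jid, name) for each child of a browse reply."""
    root = ET.fromstring(browse_xml)
    found = []
    for child in root:
        if category is not None and child.tag != category:
            continue
        # A missing type="" attrib is shown as "*", meaning any sub-type.
        kind = child.tag + "/" + child.get("type", "*")
        found.append((kind, child.get("jid"), child.get("name")))
    return found


# Hypothetical reply; only the element and attribute names follow the text.
SAMPLE = (
    '<transport jid="example.org" name="Example Server">'
    '<transport type="icq" jid="icq.example.org" name="ICQ Transport"/>'
    '<conference type="private" jid="conference.example.org" name="Private Chatrooms"/>'
    '</transport>'
)
```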

Instead of using "alt" presence, let's push browsing. a client can browse to a user, and get back a list of entities for that user, which would normally be user/* things but could be other stuff as well, particularly application/* for apps that user has running. apps can browse for just apps to other users, etc.

Expand the roster into a bookmark. the client can store any of the above entities in the roster, and include custom xml namespace data within them. The client could store favorite chat rooms, always-enabled renders, headlines, or special remote transports that you want made available in addition to the default server ones.

Implementation notes: the client shouldn't just dumbly follow browse-to-jids and keep requesting, but all returned browse queries should catalog based on the key jid which is unique, and any request from a gui would ask this catalog, which would either use existing data or request for more.
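One way to read the implementation note is a catalog keyed by jid that only goes to the network on a miss. The class below is a minimal sketch of that idea. The fetch callable stands in for whatever actually sends the jabber:iq:browse get, and all names are hypothetical.

```python
class BrowseCatalog:
    """Cache of browse results keyed by jid, which is unique per entity."""

    def __init__(self, fetch):
        # fetch: callable taking a jid and returning its list of child entities
        self._fetch = fetch
        self._entries = {}

    def children(self, jid):
        """Answer from existing data when possible, otherwise request more."""
        if jid not in self._entries:
            self._entries[jid] = self._fetch(jid)
        return self._entries[jid]

    def forget(self, jid):
        """Drop a cached entry so the next lookup refreshes it."""
        self._entries.pop(jid, None)


# Stand-in for a real network request, recording which jids were asked for.
requests_made = []


def fake_fetch(jid):
    requests_made.append(jid)
    return [{"jid": "icq." + jid, "category": "transport", "type": "icq"}]


catalog = BrowseCatalog(fake_fetch)
first = catalog.children("example.org")
second = catalog.children("example.org")
```

The gui asks the catalog rather than the server, so repeated views of the same branch cost one request.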

everything should respond to vcard, even if just a simple response, including servers/agents/transports/etc

2.3.1 registration

simple yet more customizable registration mechanism. look at XForms for the high-end insanity.

2.3.2 search

same prob as registration, and have we documented both styles of search results? also, would searches return browse entities or instead only be specific to the parent entity of the search?

2.4 Auth

2.4.1 negotiation

client would send preferred language during login, for server-generated content?

2.4.2 redirect

302 redirect to new IP, smart load balancing

2.4.3 server digest problem

The problem is simply: it is very common for passwords to be stored in a non-reversible format (digested, modern unix password files, md5 database tables, etc). Due to this, the ONLY way to authenticate a jabber client would be to supply the server with the password in clear-text so that it could perform the logic to transform it. Therefore, for true security the transport layer needs to be secure to protect the clear-text password. The options so far are SSL and SASL. SSL is already supported; if someone could evaluate SASL and the complexity involved in supporting it for clients and the server, and the overhead required, etc, that would be most helpful :)
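The sketch below illustrates why the clear-text password is needed when only a digest is stored: the server can do nothing but re-run the same one-way transform on what the client supplies. SHA-256 without a salt is an arbitrary choice for illustration, not what any particular password store uses.

```python
import hashlib
import hmac


def store_password(password):
    """What the server keeps: a one-way digest, not the password itself."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify(stored_digest, supplied_clear_text):
    """The only check available: transform the clear text and compare."""
    candidate = hashlib.sha256(supplied_clear_text.encode("utf-8")).hexdigest()
    return hmac.compare_digest(stored_digest, candidate)


stored = store_password("secret")
```

Because verify needs the clear text as input, the password has to cross the wire, which is why the transport must be protected.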

2.4.4 PAM-style

PAM specifies a generic way of prompting a user for arbitrary authentication information, usually the password but possibly additional information or binary information (from fingerprint scanner, etc). Because PAM usually talks to systems that use the above scheme, it often requires the clear-text password, so fundamentally we need to solve the above problem first. xkahn has a starter proposal here, which needs some slight xml tweaks to fit w/ the rest of the protocol xml style. this probably isn't going to be agreed upon for this summer, but keep it in mind for fall development :)

2.5 OOB

2.5.1 alternate xhtml body

the general (happy) consensus is that there will be one standard alternate body which is used if available. <message><body>hi</body><html xmlns=""><body><h1>hi</h1></body></html></message>. clients can optionally support xhtml-basic, with table support being super-optional and adding in <font/> support for simple text styling.

all the textual content must always be represented in the normal body for non-supporting clients. if after sending xhtml with a thread, and the thread being returned, the client can stop sending the xhtml (because a client supporting xhtml would return the thread in the next message and use xhtml). this detection logic isn't required, but just suggested for efficiency reasons.
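The optional detection logic could be tracked per thread roughly as below. This is a sketch under the assumption that the client can tell whether a reply carried an xhtml body. The class and method names are invented.

```python
class XhtmlTracker:
    """Remember, per thread, whether the other side appears to use xhtml."""

    def __init__(self):
        self._sent_xhtml = set()
        self._plain_only = set()

    def should_send_xhtml(self, thread):
        """Keep sending xhtml until a thread is known to be plain-only."""
        return thread not in self._plain_only

    def note_sent(self, thread, with_xhtml):
        if with_xhtml:
            self._sent_xhtml.add(thread)

    def note_reply(self, thread, reply_has_xhtml):
        # The thread came back without xhtml after we sent some, so the
        # other client presumably does not support it.
        if thread in self._sent_xhtml and not reply_has_xhtml:
            self._plain_only.add(thread)


tracker = XhtmlTracker()
tracker.note_sent("t1", with_xhtml=True)
tracker.note_reply("t1", reply_has_xhtml=False)
tracker.note_sent("t2", with_xhtml=True)
tracker.note_reply("t2", reply_has_xhtml=True)
```

The normal body is still sent in every case, so nothing is lost if the guess is wrong.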

2.5.2 offline / 3rd-party (fw) transfers

from pgmillard: basically, you connect to the host specified by the user and send a "PUT /myuserpath/foo.png HTTP/1.1" command, plus the Host: and Content-Length , etc.. std headers. then just send out the file in chunks. the Apache module that I'm testing w/ has checks for IP's, auth, etc.. (and there's a few public WebDAV hosting sites already) So from a protocol side, I send the file to the WebDAV Server first, THEN send the recipient a normal oob iq tag like normal, but with the remote URL instead of mine. so from the receiver side, it's transparent.

the only stickler here is how long does the file reside on the remote server? the client can track all the stored files and use some other method to expire/remove them. a possibility: add an expiration extension to mod_dav
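The upload step pgmillard describes can be sketched without any network code: format the PUT request head, then cut the file into chunks to write to the socket. The host name and chunk size are assumptions, and auth headers are left out.

```python
def build_put_header(host, path, length):
    """Format the request line and standard headers for the upload."""
    lines = [
        "PUT {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Content-Length: {}".format(length),
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def chunks(data, size=4096):
    """Yield the file contents in pieces small enough to send one at a time."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


payload = b"0123456789"
header = build_put_header("dav.example.org", "/myuserpath/foo.png", len(payload))
```

After the upload succeeds, the sender would pass the remote URL to the recipient in the usual oob iq.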

2.5.3 direct socket negotiation

just use other URIs in OOB? like for direct xmlstream between clients xmlstream:

2.5.4 session negotiation

for starting applications, games, etc...

possibly use jabber:iq:start and some standard fields. similar to iq:oob but for starting apps and exchanging critical values (ip, port, app start parameters, etc)

start direct xmlstream sessions too, for char-by-char chats, shared xml events/dom, private messaging, etc.

overlap with SIP?

2.6 PKI

use pgp/gpg and existing PKI. put Key ID or URI to it in vCard (where?). susceptible to easy replay attacks in chat or presence?

2.6.1 signing

include this in messages or presence: <x xmlns="jabber:x:signed">...hash of body or status...</x>. always sign presence, indicates that secure/encrypted chat is available.

2.6.2 encrypting

include this in messages: <x xmlns="jabber:x:encrypted">encrypted body</x>, normal body might contain a message warning that it was encrypted.

2.7 Server

2.7.1 Dial-Back

ensure valid originating domain name of servers, reduce spam

2.7.2 Tracking

sending server IP, path tracking


3. C Libs

3.1 libxode

3.1.1 xmlnodes

serializing - amount of benefit?

2str efficiency

cdata representation, building efficiency (escapes)

hashes for data structures


internal void* attrib value type? use xmlnodes internally to pass pointers?

3.1.2 strings

jabber is a string HEAVY server, so go to extra lengths to make it more efficient?

global hashes for common strings?

jpacket hints (I forgot what I meant when I wrote this, maybe I'll remember eventually :)

refcounting strings?

garbage collection

using const char * more frequently

3.1.3 xml parsing

write own? (possibly for embedded systems)

don't use any other higher-level xml stuff, does it add any overhead? would jabber-specific one help?

tokenizing directly into a hash/refcounted string system

ignore (not fully parse/store, just tokenize/validate) other namespaces

on the fly parsing

(malloc perf, no mem sleep?)

3.2 libjabber


4. Server (jserver)

4.1 core

MAPI additions


Heartbeat (cleanup, timeouts)





User Management



Config - Reloading


(race w/ file io?)

4.2 Modules






4.3 XDB









5. Server (etherx)

Some big rumblings going on here... Possibly, repurpose etherx as the 'jabber administrative server'. It's a smart daemon that loads a config file and manages components. (this sounds a lot like an orb, scary) These components are all simply transports, they can send and receive xml packets, (c ones would use a library to abstract this out). The daemon would start the configured transports (shell/fork) and monitor them, possibly restarting. (use STDIN/OUT instead of socket?)

The current thoughts are to move the existing service api in jserver OUT, so that any trusted transport would be able to manage the user connection and tell the session manager process about that user, and handle their incoming/outgoing data. This means the IRC, HTTP, Telnet, WAP, or any other client-access gateway can exist in its own process or on a separate box.

The new next-gen admin svr daemon would also be able to link to copies of itself on other boxes, and send packets to the right session manager process, allowing a server-farm to scale identically to an http server, just adding hardware (linking in additional session managers).

Transport administrative functionality could be hosted in the admin svr, such as error messages, current admin statistics, flags/settings, a live editable 'registry' (transports could store settings in it, would be fed on startup, etc).

transport presence (maintaining s10n's, notifications, etc), (thinking like apache/http-server model, docroot, uris, /transport)



6. Server II

Jabber Server II Architecture (draft)


7. Groupchat

This section contains a mix of protocol and implementation items

topic (use subject) In private chat, any user may change the topic by sending a message with a <subject/> tag in it (possibly a blank subject to display the topic?), which will cause a server message displaying the new topic to all users in the channel. the server will send a message with the topic in it (in the subject tag) only when a user enters the channel and when the subject changes.

identity (finding/exposing real id, vCard?) Server will assign a gid (guid) to each user. this will be used to identify a user within groupchat... this can be stored (with the new server architecture) in the user's iq:private namespace possibly.. or stored by the gc server in some other fashion.. any user<-->user interaction will take place using the guids.. so the users don't know the real jids of the other user... this could be as simple as group@server/gid where /gid could be some complex hash that hides the user's identity. the server must pass through any messages or iq requests to the other user's real jid. this keeps the users anonymous on the server, but a user will always have the same gid, to keep accountability.

registration, in roster? currently there are two implementations of registration with a groupchat server.. the "irc-t way" and the "gc-t" way.. here is a brief explanation of both.. in irc-t, a user registers with the server with a nick and a server to connect to. irc-t logs the user into the server, BUT DOES NOT JOIN THE CHANNEL... the user may then add irc users' (nick names) to his/her roster through the normal means, the jid would look like: and the irc-t server will send presence to the user if the nick is in use on that server... messages can be sent back and forth between roster items, as normal.. in the gc-t method, you register a nick and channel to join.. gc-t joins the channel for you, (you are in the channel) and you see presence in the channel as resources of the transport jid.. of people who are in the channel currently... you may send messages to these users as normal, by sending to the specific resources.. you cannot however, see the messages from the channel (unless the client shows you these messages) and can send messages to the channel by sending to the server jid (no resources) IMHO, this method will break the current client's implementation of groupchat... a method needs to be decided on, and standardized, so that all groupchat implementations have the same registration functions.

mailing list mode? the way this would work is simple.. users would register with the transport.. (maybe a separate mailing list transport??) and messages sent to this transport would go to all users who were subscribed to the list.. almost seems like a mod_groups function.. but different i guess.. =]

custom headlines ?? someone care to explain this?

oob attachments file transfers to other gc users.. or possibly broadcast to all users?

ignoring msgs w/ no body (for in-line games, etc) this should be a more global requirement.. all clients should ignore all messages with no body, IMHO.. so that x tags that are not understood don't show up in the client at all

room redirection a simple twist on the 302 presence error would work well i think..
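For the identity item above, one possible way to get a gid that hides the real jid yet stays the same for a given user is a keyed hash held by the groupchat server. This is just one reading of "some complex hash". The secret, the choice of HMAC-SHA256, and the helper names are all assumptions.

```python
import hashlib
import hmac


def make_gid(server_secret, room, real_jid):
    """Derive a stable, non-reversible gid for a user in a room."""
    message = (room + "\0" + real_jid).encode("utf-8")
    return hmac.new(server_secret, message, hashlib.sha256).hexdigest()


def room_address(room, gid):
    """Build the group@server/gid address other participants see."""
    return room + "/" + gid


# Hypothetical values for illustration.
SECRET = b"known only to the groupchat server"
gid = make_gid(SECRET, "group@server", "you@yourserver")
address = room_address("group@server", gid)
```

The server would still need its own gid-to-jid table to pass messages through to the real jid, since the hash cannot be reversed.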

7.1 meta-nick management

The current groupchat model works well for messages and presence notifications, but something needs to be done about the nick management.. i.e. changing nicks, tracking nick changes, etc... the current proposal is to use an IQ system before you enter the group, to handle nick changes, or requests associated w/ nick/id management.

Meta-Nick Management:

To join a room, you start by initiating an iq conversation with the server to negotiate a nick.. you would send an: <iq type='get' to='grp@gserv'><query xmlns='jabber:iq:groupchat'/></iq> .

which could return something like: <iq type='result' from='grp@gserv'><query xmlns='jabber:iq:groupchat'><nick/><secret/><key>234a...</key></query></iq> .

the flags tell the client which fields are required to be filled out, in order to join that channel. and then you send a: <iq type='set' to='grp@gserv'><query xmlns='jabber:iq:groupchat'><key>234a</key><secret>foo</secret><privacy/><nick>dude</nick><nick>dude_</nick><privacy/></query></iq> .

if your nick could not be set for any reason.. (invalid, in use, etc) an error will be returned with the reason.. otherwise the server will return a result with a list of all the users in the group: <iq type='result' from='grp@gserv'><query xmlns='jabber:iq:groupchat'><topic>The Room's Topic!?</topic><item jid='you@yourserver' res='2e3d...' name='dude'/><item res='43d3...' jid='' name='foobar'/></query></iq> .

the res resource in this example is a unique identifier for a user. since the name may change during a groupchat session, the res and jid will always remain the same.. this way you will be able to tell who a user is when you receive an iq set.. At this point, you have reserved your nick, but you are still not part of the channel until you send your available presence to the group (like the "old" way). once you do, you will receive presence and messages like: <presence from='grp@gserv/<res>'><status>lala</status></presence>

<message type='groupchat' from='grp@gserv/RES'><body>lala</body></message> .

Because this format is much different from the current groupchat format, users logging in with the old format will get an error, indicating that their protocol is out of date

You would send an iq set with any changes or updates to your settings, and you would receive an iq set (like the roster push) whenever any item changes (or whenever the topic, desc, or other room meta-data changes?).
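On the client side, the member list and later pushes could be kept in a table keyed by res, since res stays fixed while the name may change. The sketch below assumes the stanza shapes shown above. The sample jids and the helper names are made up.

```python
import xml.etree.ElementTree as ET

NS = "{jabber:iq:groupchat}"


def read_items(iq_xml):
    """Map each res to its jid and current nick from a groupchat iq."""
    query = ET.fromstring(iq_xml).find(NS + "query")
    return {
        item.get("res"): {"jid": item.get("jid"), "name": item.get("name")}
        for item in query.findall(NS + "item")
    }


def apply_push(members, iq_xml):
    """Merge a pushed iq set; entries with a known res are updated in place."""
    members.update(read_items(iq_xml))
    return members


# Hypothetical join result and a later nick-change push.
RESULT = (
    "<iq type='result' from='grp@gserv'><query xmlns='jabber:iq:groupchat'>"
    "<topic>Room topic</topic>"
    "<item jid='you@yourserver' res='2e3d' name='dude'/>"
    "<item jid='other@example.org' res='43d3' name='foobar'/>"
    "</query></iq>"
)
PUSH = (
    "<iq type='set' from='grp@gserv'><query xmlns='jabber:iq:groupchat'>"
    "<item jid='other@example.org' res='43d3' name='foobaz'/>"
    "</query></iq>"
)
members = read_items(RESULT)
apply_push(members, PUSH)
```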

You only get messages when you are available, sending unavail will basically "leave" the room, you'll no longer receive messages. Items received w/o presence would not be displayed as participating in the room.

Theoretically, you can use this model as a "mailing list", where you're not chatting with a group of people, but sporadically sending normal messages to the group. (hence a tendency to want to rename it 'conference' which could be a chat conference or message conference, public, private, etc)

The server could persist the iq set in this mode, and return a flag that it's a mailing list. (issues with unsubscribing, knowing that you're on the list? etc...)

7.2 private (conference)

7.3 public

topic navigation, sub-agents? dynamic tree

group size



7.4 limits



size (messages, participants, etc)

word filtering (for children chats, etc)


8. Web Integration

8.1 JOW

Jabber On Web -- full blown client that requires no server-side (cgi/php/etc) programming, is pure static HTML/JS, uses the Jabber HTTP service. checked into jabber-transport/src/svc/http/examples/jow, has some hard-coded values, VERY prototypish, but functional. see:

8.2 Portal Client

Portal-integrated client, limited functionality (unless you use dhtml and a hidden/i frame). see working mock-up:

8.3 Personal Center

Live presence buttons, and instant message/chat, available from personal home page/etc.


9. Distributed

9.1 JUD

Distribute queries to all JUD servers

9.2 groupchat

have public chatrooms that are tied together across all servers, use URI hash as room name for web-following chats and presence tracking


10. Insert Self Here

Many of you have checkin access to, go ahead and add in your own development plans for your client/transport/etc if you think it would be of value here.


11. Client Projects

11.1 Mozilla




12. Seedling Projects

12.1 source development tool

source/file-based presence and chat

12.2 software tracking

need RSS for software versioning, freshmeat.xml, update notification just like rss

12.3 virgule

distributing independently across sites, site trust metrics, fun experiment :)

12.4 abiword

collab editing, please! also, rich interface for msg formatting, auto chat-into-doc transition

12.5 whiteboard


12.6 voice


12.7 mp3

chats for icecast, gnutella



13. Growth Projects (new and maturing)

13.1 pager

simple pager transport

13.2 rss

RSS, news, stocks

13.3 SMTP

in python, handling attachments, flexibly filtering

13.4 telnet/shell

toy/demo, run shell commands

13.5 toys

jibberish, eliza, fortune

13.6 translator


13.7 utils

notes, calendar (use perl mod for human dates), dictionary, pop checker

13.8 josh

and family

13.9 java transports

similar to servlets, for IM/P

13.10 vcard

leverage more?

13.11 private

SyncML, bookmarks, config, 3rd party reg


14. Security

14.1 SSL

14.2 Transports

14.3 Info Storage

14.4 Other


15. AIM-Transport

15.1 DirectIM

15.2 Chat

15.3 Profile