MSc Dissertation - The device

Functionality

The device will fulfil two somewhat separate functions. Firstly, it will assist the user during the main part of his job: patrolling and dealing with incidents. Secondly, it will collect information during this time for use afterwards, when writing reports on the events of the preceding shift.

Information retrieval

Information retrieval will comprise the recall of small amounts of key factual information, with little or no effort by the user. The device is not intended to retrieve large amounts of information, like reading from a book; the object of the device is to provide the user with crucial facts at the salient moment. The primary factor here is the concept of the minimal user request. The main user input to the device will be one button. This ‘help-me-out’ button will activate the unit to deliver the context-relevant information via the ear-piece. Beyond the moment where information is provided, the device’s operation will be transparent to the user.

In order to achieve this, the device must operate independently, attempting to maintain its own awareness of the external world. When the user asks for help the device will already be aware of its current and previous locations and have already selected the most likely user action for this circumstance. Asking for help simply delivers whichever factual information is most likely to be useful at that time.

Each location will have factors associated with it:

doors will be associated with their code or key number
alarm panels will be associated with their location and code
locations will be associated with adjoining locations in the form of a map
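
The associations listed above could be held in a structure along the following lines. This is only a minimal sketch: the class name, the field names and the example site are hypothetical, and it assumes the records are available to the device at the moment they are needed (whether stored locally or fetched over a radio link).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Location:
    """One place the device can recognise (all field names are hypothetical)."""
    name: str
    kind: str                          # "door", "alarm_panel" or "area"
    code: Optional[str] = None         # door code or alarm code, if any
    key_number: Optional[str] = None   # key number, for doors opened by key
    adjoining: List[str] = field(default_factory=list)  # neighbouring locations


def build_site(locations: List[Location]) -> Dict[str, Location]:
    """Index the locations by name so the map can be followed place to place."""
    return {loc.name: loc for loc in locations}


def neighbours(site: Dict[str, Location], name: str) -> List[Location]:
    """Return the records of the locations adjoining the named location."""
    return [site[n] for n in site[name].adjoining if n in site]


# An invented four-location site, for illustration only.
site = build_site([
    Location("Lobby", "area", adjoining=["Lab door"]),
    Location("Lab door", "door", code="1234", adjoining=["Lobby", "Lab"]),
    Location("Lab", "area", adjoining=["Lab door", "Lab alarm panel"]),
    Location("Lab alarm panel", "alarm_panel", code="9876", adjoining=["Lab"]),
])
```

Here the map is simply the set of adjoining links between records, so directions can be derived by following those links from the current location.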

Information will be selected by comparing the user’s location with the expected tasks in the task analysis (see Figure 2 in Chapter 4 - ‘The Design’). By examining the user’s location the device can select the appropriate response. Following the task analysis as far as accessing the alarm panel gives a flow chart for deciding which response to give (see figure 5).

Figure 5 - Flow chart for response selection

It is recognised that the proposed system will not always select the appropriate information for delivery to the user. However, as computers evolve away from the desk-top and the spread-sheet towards a more human-centred presence, with wearability and ‘intuition’, so they will begin to deal with the world in terms other than black-and-white. The ‘exact answer’ will become the ‘best guess’. This margin of ‘error’ is inevitable, given that the device cannot read the mind of its wearer. The important point is that by maintaining an awareness of its surroundings, the device should give the required information on the majority of occasions. The existing two-way radio link to a control room will remain as a backup. However, the use of this device by all security officers in a team should greatly reduce the load on the central controller allowing for much shorter query response times.

Event recording

Event recording will comprise the recording of a still picture, a sound clip (five seconds in the prototype), location, date and time. The device will be activated to record these details by certain environmental events as shown in table 1.

Event	Picture	Sound	Location	Time
Entering a new area	-	-	*	*
‘Help-me-out’ request	-	-	*	*
Face shape recognition	*	*	*	*
Radio transmission	*	*	*	*
Manual recording request	*	*	*	*

Table 1 - Information storage by event
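
Table 1 can also be expressed as a simple lookup from event to the set of items recorded. The sketch below is illustrative only; the event and field identifiers are hypothetical names chosen for this example.

```python
# Items recorded for each triggering event, transcribed from Table 1.
# The identifiers themselves are hypothetical.
ALL_ITEMS = frozenset({"picture", "sound", "location", "time"})
POSITION_ONLY = frozenset({"location", "time"})

ITEMS_BY_EVENT = {
    "entering_new_area": POSITION_ONLY,
    "help_me_out_request": POSITION_ONLY,
    "face_shape_recognition": ALL_ITEMS,
    "radio_transmission": ALL_ITEMS,
    "manual_recording_request": ALL_ITEMS,
}


def items_to_record(event: str) -> frozenset:
    """Return the items the device should store when the event occurs."""
    return ITEMS_BY_EVENT[event]


def needs_camera(event: str) -> bool:
    """True if the event requires a still picture to be captured."""
    return "picture" in items_to_record(event)
```

Keeping the table as data rather than as branching logic would make it straightforward to add further triggers, such as a glove-holder sensor, without changing the recording code.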

Ideally, the device would take regular pictures and sound clips during an incident. However, there is no easy way for the device to be aware that an incident has begun; nothing consistently happens that could be observed. One possible idea is to monitor the security officer’s pulse rate, although this may not change greatly during some types of incident. For the hospital users there is another possibility. Many incidents involve having to physically restrain someone. Therefore if the security officer considers an incident is imminent he will usually put on his gloves as a precaution. A simple holder for the gloves, with a sensor, could inform the device that an incident has begun.

"When we touch someone we wear these [leather gloves] ... because you don’t know where they’ve been" (Simon Calloway, see Appendix C1)

Hardware

In order to achieve the goal of transparency (see chapter 3 - ‘Design Background’) the device will require a number of sensors through which to maintain an awareness of its environment. The same sensors will allow the capture of information for the PC/Mac based retrieval software.

The final cost to the user is an important factor, given the possibility of commercial exploitation of the idea. The device therefore needs, wherever possible, to use hardware currently in production.

The original concept envisaged a distributed system of physically separate hardware devices logically connected by radio. These devices would be:

a digital camera and microphone in the form of a wearable object, such as a tie clip or brooch
an ear-piece
a control unit worn on the waist

However, investigations on two fronts point to the use of one unified device. Firstly, the cost and practicality of a distributed system do not look promising. Secondly, and perhaps more importantly, the users appear quite happy with one unified device worn in a shoulder holster. The only external element would be the ear-piece, which would be wired to the control unit.

Physical properties

The control unit is effectively limited in maximum size and weight to that of the existing two-way radio systems the security officers currently use. As the control unit will contain the digital camera and microphone it will need to be worn in a visible manner. This will expose it to the elements; it is therefore important that the control unit is completely waterproof. Similarly, the security officer’s job is an active job - the device must also be fully shockproof.

Location

"If a computer knows merely what room it is in, it can adapt its behaviour in significant ways without requiring even a hint of artificial intelligence." (Weiser, 1991, pg68)

One key factor in the device’s performance is its knowledge of where it is at any moment. From this simple fact it will be able to infer a great deal. For example, if the security officer presses the ‘help-me-out’ button while standing at a door for which the device holds an access code then, in the absence of more pressing information, it would be sensible for the device to give that access code. (More pressing information could be the fact that the device is in the process of directing the security officer to an alarm panel; standing at the same door under this circumstance probably means that he took a wrong turn.)
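
The door example above can be sketched as a small selection function. This is a simplified illustration under stated assumptions, not the full flow chart of figure 5: the function name, its parameters and the wording of the spoken responses are all hypothetical.

```python
from typing import Dict, Optional


def select_response(current: str,
                    door_codes: Dict[str, str],
                    destination: Optional[str] = None) -> str:
    """Choose what to say when the 'help-me-out' button is pressed.

    current     -- the location the device believes it is at
    door_codes  -- access codes held for doors, keyed by door name
    destination -- the alarm panel the officer is being directed to, if any
    """
    # More pressing information takes priority: while the officer is being
    # directed somewhere, being at another location suggests a wrong turn.
    if destination is not None and current != destination:
        return f"Continue to {destination}"
    # Otherwise, a door with a known access code yields that code.
    if current in door_codes:
        return f"Access code {door_codes[current]}"
    # Assumed fallback when nothing useful is known for this location.
    return "No information for this location"
```

For instance, with door_codes set to {"Lab door": "1234"}, a request made at "Lab door" returns the access code, whereas the same request made while the officer is being directed to "Lab alarm panel" returns the direction instead. A fuller version would consult the map to decide whether the door lies on the route.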

An on-board system will keep track of the current location. For larger sites it would be beneficial to incorporate a Global Positioning System (GPS) receiver into the control device. For smaller sites, and for indoors where the GPS signal is too weak, a simple system of bar-coded signs at strategic points, such as doors and alarm panels, will provide location information. Output from the device’s camera (see below) will be analysed for bar-coded location information.
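
A minimal sketch of such an on-board tracker is given below. It assumes that both bar-code sightings and GPS fixes have already been translated into named locations; the class and method names are hypothetical.

```python
from typing import List, Optional


class LocationTracker:
    """Keeps the current location and the trail of previous locations."""

    def __init__(self) -> None:
        self.trail: List[str] = []

    @property
    def current(self) -> Optional[str]:
        """The most recently reported location, or None before any report."""
        return self.trail[-1] if self.trail else None

    def report(self, location_name: str) -> bool:
        """Record a sighting from a bar-code or a GPS-derived position.

        Returns True when the officer has entered a new area, which is one
        of the events in Table 1 that triggers a location and time record.
        """
        if location_name == self.current:
            return False  # still in the same place; nothing new to store
        self.trail.append(location_name)
        return True
```

Because repeated sightings of the same sign are ignored, analysing one image per second would not flood the record with duplicate entries.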

Visual

The on-board camera will be capable of medium resolution (e.g. 640 x 480 pixels, 8-bit colour), with a fairly slow frame rate (e.g. a maximum of one frame per second).

Software will monitor the visual field for the consistent appearance of a human face shape over several frames (three in the prototype). This will allow the device to be aware that the user is engaged in conversation. Studies have shown this to be practicable. For example, Claus Neubauer (1998) reports on experiments with artificial neural networks. He used an image of only 32 x 32 pixels (less than 7% of the proposed device’s captured image in each dimension). Despite this, the network correctly found the presence of a face in the image 87% of the time, with only 2.5% false positives.

Thus the device will regularly examine its environment visually via the digital camera (once per second in the prototype). Each image will be analysed, with a search made for:

A bar-code containing location information
A human face shape
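
The requirement that a face shape appear consistently over several frames can be sketched as a simple streak counter. The sketch assumes that "consistent" means consecutive frames and that a separate detector supplies a yes/no result for each image; the class name and interface are hypothetical.

```python
class FaceWatcher:
    """Reports a conversation once a face has been seen in enough
    consecutive frames (three in the prototype)."""

    def __init__(self, frames_required: int = 3) -> None:
        self.frames_required = frames_required
        self.streak = 0  # consecutive frames in which a face was found

    def update(self, face_in_frame: bool) -> bool:
        """Feed in one frame's detector result.

        Returns True while the face has been present for at least
        frames_required consecutive frames.
        """
        self.streak = self.streak + 1 if face_in_frame else 0
        return self.streak >= self.frames_required
```

At one frame per second, this means a face must remain in view for about three seconds before the device treats the user as being in conversation, which should filter out most isolated false positives from the detector.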

Audio

The device will have a microphone for recording ambient sounds, especially speech.

Time

The device will record the date and time along with each stored event.

Telecommunications

The device will be required to provide information about door codes, alarm codes, key numbers and the layout of buildings. It must also store the captured images and sounds. In achieving these goals there appears to be a definite requirement for the device to be in wireless communication with a base computer:

The users interviewed were very worried about carrying a device that contained every code for every building. No amount of promises of password protection, PINs, etc. seemed to fully allay these fears. Indeed, the hospital users take this problem so seriously that they do not carry any codes on their person; when a code is required they radio a request to the control room, which sends the appropriate code to the security officer’s personal pager. Even the University users, who already carry this information in a paper notebook, are unhappy about a computer device containing this information. It would therefore appear that holding the codes centrally and sending them to the remote devices as required may be important for user acceptance.
The users are worried about not having the most up-to-date code at any time. If a code is changed, it is important that the new code becomes available to all security officers very quickly. It may be acceptable to wait until the end of a shift, but given that a shift lasts 12 hours, accessing the (new) code over a radio link would be a distinct advantage.
With the university security users it would be possible to maintain a copy of all relevant codes and maps in each device. However, this would not scale up to other users, such as the police, who have to deal with a huge variety of situations.
The users are concerned that they may be endangering their own safety if it is known that they are carrying a recording device. It is possible that people may attempt to take or destroy the device in order to remove evidence that may later be used against them. If the recordings - especially the images - are sent to secure external storage soon after capture then this problem would be minimised.
With a communications link to a base computer, it would be possible to maintain a display of security officers’ locations. This could be useful in an emergency. However, it should be noted that the officers themselves do not appear too keen on this idea.

Storage

Storage requirements have been calculated as follows:

5 seconds of 8-bit audio sampled at 11 kHz can be compressed to approx. 30Kb.
A 320 by 240 pixel 256 colour image can be compressed to approx. 64Kb.
Each activation will therefore require approximately 94Kb of storage space (assuming that the location and time information will require negligible space). So, for example, if the device is activated (makes a recording) on average once per minute over a 12 hour shift (minus two hours for breaks) then it will require 600 minutes x 94Kb = 56,400Kb, or approximately 56Mb, of storage capacity. (Once per minute is not impossible in a busy hospital environment. Also, the device could use the location information to prevent unnecessary recording, such as in the control room.)
56Mb is relatively small by today’s standards. It will easily fit on a low-cost hard drive, but this could create problems with power consumption. It could also be held in RAM (64Mb of SIMMs currently retails at approximately £40). However, using a volatile medium would produce a significant risk of occasionally losing the data in the event of a battery failure. Other non-volatile memory is available but cost may be a problem. Bearing in mind that the users prefer not to keep the captured information on them, an alternative would be to send the data back to the central computer via the data link. A modem would be able to send 94Kb in a few seconds. The device would need to buffer the information in memory, but even sharing a communication channel with several other devices the recorded information should usually be transferred within a minute. This is perhaps the best option. However, further investigations are required, particularly in the areas of cost and funding.
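
The storage arithmetic above can be reproduced with a short calculation. This is a sketch under stated assumptions: Kb is taken to mean kilobytes of 1,024 bytes, and the link rates passed to transfer_seconds are illustrative values, not figures from this study.

```python
AUDIO_KB = 30      # 5 seconds of compressed 8-bit, 11 kHz audio (from the text)
IMAGE_KB = 64      # compressed 320 x 240, 256-colour image (from the text)
SHIFT_HOURS = 12   # length of one shift
BREAK_HOURS = 2    # time per shift during which nothing is recorded


def per_activation_kb() -> int:
    """Storage for one recording; location and time are taken as negligible."""
    return AUDIO_KB + IMAGE_KB


def shift_storage_kb(activations_per_minute: float = 1.0) -> float:
    """Total storage for one shift at the given average activation rate."""
    working_minutes = (SHIFT_HOURS - BREAK_HOURS) * 60
    return working_minutes * activations_per_minute * per_activation_kb()


def transfer_seconds(size_kb: float, link_bits_per_second: float) -> float:
    """Ideal time to send size_kb over a link, ignoring protocol overhead
    and any sharing of the channel with other devices (assumed Kb = 1,024 bytes)."""
    return size_kb * 1024 * 8 / link_bits_per_second


if __name__ == "__main__":
    print(per_activation_kb(), "Kb per activation")
    print(shift_storage_kb(), "Kb per shift")
    for rate in (28_800, 56_000):  # assumed modem rates, for illustration only
        print(rate, "bit/s:", round(transfer_seconds(per_activation_kb(), rate), 1), "s")
```

Halving the activation rate halves the per-shift total, so the 56Mb figure is best read as an upper estimate for a busy site. The transfer time for a single recording depends heavily on the link rate actually available, which is one of the points requiring further investigation.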

Power

Between shifts the device will need to be attached to a charging system. Each shift for the university security officers lasts 12 hours, with two one-hour breaks. In theory the device could be recharged during the breaks, but this would be rather inconvenient for the security officers. The batteries therefore need to last for the whole 10 working hours.

Processor

The processor(s) must be capable of digitising and compressing mono audio, controlling the digital camera and compressing the images, speech synthesis (or selection and production of sound clips held digitally), face-shape awareness, bar-code reading and communication with a central computer via the radio link.

The ear-unit

Devices are currently available for the security industry that provide the required functionality and are comfortable to wear for extended periods. It is proposed that an existing wired ear-piece be utilised. The ear unit will provide the means of obtaining information from the device. This information will be in speech format, using natural sounding, not obviously synthesised, speech.

Radio

The users like the idea of incorporating the existing 2-way radio into the device. Thus, the microphone and ear-piece could act as the transducers for conversing with the control room or other officers.

Software

The pictures, sounds and other information will be stored on an external (to the device) computer. They will be transferred either at the time of capture via a radio link, or by wire when the device is returned to the control room and plugged into the charging system. An ordinary PC/Mac will then be used to review the information.

The hi-fi prototype shows the basic design of screen layout. In particular, the screen is divided into sections:

Top - the timeline as on the paper prototype
Bottom right - the display area for selected information (pictures and input fields).
Bottom left - header information (name and shift) and buttons for the main functions.

The design of the buttons is important in that the icon selected should be as meaningful and unambiguous as possible. Four symbolic icons (Preece et al, 1994, pg 115) were chosen to represent the buttons’ functions (see figure 6); the pictures were taken from the clip-art section of Microsoft Word:

Shift - a diary
Time - a stop-watch
Help - a life-belt
Exit - an arrow changing direction

Figure 6 - The icons used on the hi-fi prototype

Prototyping tool

The hi-fi prototype (Rudman, 1998) is written in HTML. This decision was made on the grounds that certain features of HTML were ideally suited to this prototype. Chiefly, it ensures that the prototype can be distributed to and used on a wide variety of platforms. This allows it to be easily accessed by the many people who will be involved in designing the finished product - a key aspect of user-centred design. These people include security officers at three sites, HCI experts, other academic institutions and commercial interests such as Kodak.

Special care was taken over the decision to use frames. Jacob Nielsen (1996) lists several problems with them:

Many browsers are still in use that do not support frames
Frames compromise the usefulness of the ‘back’ button
It is very difficult to print pages that use frames

The important distinguishing feature of the prototype software is that it is intended as a standalone demonstration, to be used under relatively controlled conditions. All users for whom it is intended will be accessing the software as a specific evaluation task, rather than merely selecting one page from an amorphous web of information. They can be expected to have an interest in seeing the software as it is intended - resizing the browser window as necessary, for example. Thus it is not necessary to be able to leave and re-enter the software on a whim. Further, the software is self-contained, using hypertext as an internal tool (such as clicking a picture to see a larger version in a separate frame) - there are no external hypertext links.