Data Collection/Data Capture

Data Collection/Data Capture
• Format:
• Data collection for mobile devices (see documents on schedule)
• Some ideas for storage and persistence
• Transferring data to/from a PDA
• Sample XML file (+ code). Tutorial: data collection via a smart device application using XML
• RFID Electronic Product Code (EPC)
• Focusing more on the data structure
• RFID and databases
• Commercial offerings: ObjectStore
• Thoughts for an Access design
j.c.westlake@staffs.ac.uk

Recap of RFID readings in week 4
• The data held on a tag is meaningless unless it can be used to look up information in a database
• A commercial solution therefore calls for a database as part of the solution

Data storage considerations
• Mobile applications raise more considerations than single-machine or simple web-based applications, e.g.:
• Your PDA client may not be in range of a suitable transmitter to send data immediately.
• Signal attenuation may make data rates too slow for large data transmissions.
• You may have alternative means of transmission within the same device.

Data storage considerations
• Possible solutions include:
• Storing data temporarily on the PDA, e.g. in SQL Server CE.
• Sending a quick report (and not the full data) back to the main system.
• Programming a variety of send procedures into your PDA client.
• Not transmitting any data from the PDA whilst outside the office and using it purely as a data gatherer; the data is then transferred over a standard link back at HQ.
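The "store temporarily, send later" option can be sketched as a store-and-forward queue. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for SQL Server CE, and the table name `pending_scans`, the function names and the `send` callback are all hypothetical.

```python
import sqlite3

def open_queue(path=":memory:"):
    """Open (or create) the local queue of readings waiting to be sent."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_scans ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn

def record_scan(conn, payload):
    """Always store locally first, whether or not a link is available."""
    conn.execute("INSERT INTO pending_scans (payload) VALUES (?)", (payload,))
    conn.commit()

def flush(conn, send):
    """Send queued readings in order; keep any that fail for a later attempt.

    `send` is a hypothetical callable that returns True on success.
    """
    sent = 0
    rows = conn.execute(
        "SELECT id, payload FROM pending_scans ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        if not send(payload):
            break  # link dropped: stop here and retry on the next flush
        conn.execute("DELETE FROM pending_scans WHERE id = ?", (row_id,))
        sent += 1
    conn.commit()
    return sent

def pending_count(conn):
    """Number of readings still waiting on the device."""
    return conn.execute("SELECT COUNT(*) FROM pending_scans").fetchone()[0]
```

Calling `flush` whenever a link becomes available (or back at HQ) covers both the "out of range" and the "data gatherer only" cases with the same code path.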

Data formats
• There are many data formats, but they can all be placed in one of two categories:
• simple text files (Comma Separated Values etc.)
• binary files.
• Both have their own advantages and disadvantages.
• CSV files contain data that can be easily read but lack a description of their own format.
• Binary files contain data and a description of their data format (as a Word document does), but are often proprietary schemes that require specific applications to read them.

Illustrative example
Paper-based or simple text file:
Jonathan Westlake
Staffordshire University
j.c.westlake@staffs.ac.uk
Comma Separated Values:
Jonathan,Westlake,Staffordshire University,j.c.westlake@staffs.ac.uk
Binary (a string of 0s and 1s):
0101 1001 0111 0110 0110 0001 0110 1110 0010 0000 etc.
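The CSV line in this example carries values but no field names, so the reader has to supply the meaning from outside the file. A minimal sketch, assuming the field names in `FIELDS` (they are not part of the original data):

```python
import csv
import io

line = "Jonathan,Westlake,Staffordshire University,j.c.westlake@staffs.ac.uk"

# The csv module splits the line into values only.
values = next(csv.reader(io.StringIO(line)))

# Assumed field names: nothing in the CSV line itself says what each value means.
FIELDS = ["firstname", "lastname", "workplace", "email"]
contact = dict(zip(FIELDS, values))
```

If the columns were reordered in the file, this code would silently mislabel them, which is the weakness of CSV that the slide points out.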

Introducing XML (most of you will have done this with me)
<contact>
  <name>
    <firstname>Jonathan</firstname>
    <lastname>Westlake</lastname>
  </name>
  <workplace>Staffordshire University</workplace>
  <email>j.c.westlake@staffs.ac.uk</email>
</contact>

Why XML?
• eXtensible Markup Language (XML) is a widely accepted way to share data over the Internet in a format all computers can use
• It is semantic: the markup gives the data meaning
• The idea is to create a computer language that companies can use to describe products, so that a computer can search for products in the inventory
• It is interesting to review the Auto-ID Center's intentions (see next screen grabs)

XML: Well-formed and Valid
• The rules for XML files are much stricter than for HTML. A forgotten tag or an attribute without quotes makes the file unusable.
• The official XML specification says:
• Applications are not allowed to try to second-guess the creator of a broken XML file
• If the file is broken, an application has to stop right there and issue an error.
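The "stop and issue an error" rule is easy to observe with a standard parser. The sketch below uses Python's built-in `xml.etree.ElementTree`; the helper name `is_well_formed` and the three sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it raises an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = '<customer visited="no"><id>506</id></customer>'
# The <id> element is never closed.
missing_close = '<customer visited="no"><id>506</customer>'
# The attribute value has no quotes around it.
unquoted_attr = '<customer visited=no><id>506</id></customer>'
```

An HTML browser would typically render something for all three strings; the XML parser rejects the last two outright.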

Sample customer XML file
<?xml version="1.0" encoding="ISO-8859-1" ?>
<customers>
  <customer visited="no">
    <id>506</id>
    <name>
      <firstname>Reg</firstname>
      <lastname>Kemp</lastname>
    </name>
  </customer>
  <customer visited="yes">
    <id>329</id>
    <name>
      <firstname>Kerry</firstname>
      <lastname>Gorblimey</lastname>
    </name>
  </customer>
</customers>
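One way to read a file like this on the client is sketched below with Python's built-in `xml.etree.ElementTree`. The function name `load_customers` and the dictionary keys are illustrative choices, not part of the tutorial code.

```python
import xml.etree.ElementTree as ET

CUSTOMERS_XML = b"""<?xml version="1.0" encoding="ISO-8859-1" ?>
<customers>
  <customer visited="no">
    <id>506</id>
    <name><firstname>Reg</firstname><lastname>Kemp</lastname></name>
  </customer>
  <customer visited="yes">
    <id>329</id>
    <name><firstname>Kerry</firstname><lastname>Gorblimey</lastname></name>
  </customer>
</customers>"""

def load_customers(xml_bytes):
    """Turn each <customer> element into a plain dictionary."""
    root = ET.fromstring(xml_bytes)
    customers = []
    for node in root.findall("customer"):
        customers.append({
            "id": int(node.findtext("id")),
            "firstname": node.findtext("name/firstname"),
            "lastname": node.findtext("name/lastname"),
            # The visited attribute holds "yes" or "no" in the sample file.
            "visited": node.get("visited") == "yes",
        })
    return customers

customers = load_customers(CUSTOMERS_XML)
to_visit = [c["lastname"] for c in customers if not c["visited"]]
```

Here `to_visit` lists the customers still to be seen, which is the kind of query a data collection client would run against the file.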

The idea of an Auto-ID
• A tag can add a unique identity to every asset, case and item
• This identity can then be used to show which item is being read and all of its attributes
• Sounds so simple, does it not?

The enterprise architecture of RFID: information flow
• Each reader sends the tag's ID to its system
• The system compares this to the local database
• If the ID is not recorded there, the Object Name Service (ONS) is consulted (see slide 22 at end of lecture)
• The ONS provides a link to a Physical Markup Language (PML) server where the product's details are stored
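The lookup order in this flow can be sketched as follows. This is a toy model only: two in-memory dictionaries stand in for the local database and the ONS, the dotted EPC notation (version.manager.class.serial) and the example.com address are assumptions, and a real ONS lookup is a network query rather than a dictionary read.

```python
# Hypothetical local records, keyed by a dotted EPC string.
LOCAL_DB = {
    "01.0000d04.000012.715248247": "Known asset 715248247",
}

# Hypothetical ONS table: manager number -> address of that company's PML server.
ONS_TABLE = {
    "0000c02": "http://pml.example.com/0000c02",
}

def resolve(epc):
    """Return ('local', record), ('ons', url) or ('unknown', None)."""
    if epc in LOCAL_DB:
        return "local", LOCAL_DB[epc]
    # Not held locally: ask the (toy) ONS which PML server describes this product.
    manager = epc.split(".")[1]
    url = ONS_TABLE.get(manager)
    if url is not None:
        return "ons", f"{url}/{epc}"
    return "unknown", None
```

The point of the ordering is that the network is consulted only for tags the local system has never seen.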

A conceptual operational model
• Data transmitted from the reader is represented by a table in a database, which sits on a server (or could sit on a PDA)
• The required despatch unit (DU) is selected from the drop-down list
• The database automatically processes this data and reports discrepancies
• For a firm with lots of RFID-tagged parts, the amount of data to be processed will be huge, and this has led to research into architectures that can cope with it
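The discrepancy report can be thought of as a comparison between the tags expected in the selected despatch unit and the tags actually read. A minimal sketch, with a hypothetical function name and made-up tag identifiers:

```python
def report_discrepancies(expected, scanned):
    """Compare the tags expected in a despatch unit with the tags actually read."""
    expected, scanned = set(expected), set(scanned)
    return {
        "missing": sorted(expected - scanned),     # expected but never read
        "unexpected": sorted(scanned - expected),  # read but not on the list
    }

report = report_discrepancies(
    expected=["tag-001", "tag-002", "tag-003"],
    scanned=["tag-002", "tag-003", "tag-009"],
)
```

In a database the same comparison would normally be written as a pair of outer joins or NOT IN queries between the expected-items table and the scan table.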

“Savant” (to know) databases
• Readers need to use the same type of database to determine what the EPC number is and which item it is connected to.
• This database is known as Savant.
• It needs agreement on how the code should be structured.
• The Auto-ID Center has created software technology called Savant to manage and move information in a way that does not overload existing corporate and public networks.
• Savant uses a distributed architecture, meaning it runs on different computers distributed through an organisation, rather than on one central computer.
• Savants are organised in a hierarchy and act as the nervous system of the new EPC network, managing the flow of information.

Data classification - choices
• Comes down to: EPC or not to EPC!
• EPC slides follow
• Non-EPC: a DIY or non-standard approach

What is EPC?
• EPCglobal™ is a joint venture between the Uniform Code Council (UCC) and the European Article Numbering (EAN) Association.
• It is the main organisational body for standardisation of the EPC.
• The EPC is a numbering system with the ability to easily incorporate a unique identifier at the individual product level.
Source: http://www.symbol.com/products/whitepapers/rfid_key_issues.html

What is EPC?
• The Electronic Product Code™ (EPC) is seen as the next generation of product identification.
• Existing numbering systems include EAN.
• The EPC is a simple, compact “license plate” that uniquely identifies objects (items, cases, pallets, locations, etc.).
• The EPC is built around a basic hierarchical idea that can be used to express a wide variety of different, existing numbering systems, like the EAN.

The case for EPC?
• An EPC (Electronic Product Code) is simply a number, typically from 64 to 256 bits long
• It is standardised so that thousands of trillions of items in the world can each be assigned a unique identification number, a unique EPC
• It is vital that everyone uses one type of system and not their own code
• Otherwise there would be great confusion, because so many different people need to read the same tag

What is EPC?
• An EPC number contains:
1. Header, which identifies the length, type, structure, version and generation of the EPC
2. Manager Number, which identifies the company or company entity
3. Object Class, similar to a stock keeping unit (SKU)
4. Serial Number, which is the specific instance of the Object Class being tagged
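Because an EPC is a single number, the four parts are recovered by slicing its bits. The sketch below assumes the 8/28/24/36-bit split used by the 96-bit General Identifier layout; the slide does not state field widths, so treat the widths, the header value 0x35 and the sample numbers as assumptions made for illustration.

```python
from typing import NamedTuple

# Assumed field widths for a 96-bit EPC (8 + 28 + 24 + 36 = 96).
HEADER_BITS, MANAGER_BITS, CLASS_BITS, SERIAL_BITS = 8, 28, 24, 36

class Epc(NamedTuple):
    header: int
    manager: int
    object_class: int
    serial: int

def pack_epc(epc):
    """Concatenate the four fields into one integer, header first."""
    value = epc.header
    value = (value << MANAGER_BITS) | epc.manager
    value = (value << CLASS_BITS) | epc.object_class
    value = (value << SERIAL_BITS) | epc.serial
    return value

def unpack_epc(value):
    """Split the integer back into its fields, starting from the low-order bits."""
    serial = value & ((1 << SERIAL_BITS) - 1)
    value >>= SERIAL_BITS
    object_class = value & ((1 << CLASS_BITS) - 1)
    value >>= CLASS_BITS
    manager = value & ((1 << MANAGER_BITS) - 1)
    value >>= MANAGER_BITS
    header = value & ((1 << HEADER_BITS) - 1)
    return Epc(header, manager, object_class, serial)

# Illustrative values only.
example = Epc(header=0x35, manager=0xD04, object_class=0x12, serial=715248247)
packed = pack_epc(example)
```

`unpack_epc(packed)` returns the original four fields, which is what a reader's software does before looking the item up in a database.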

EPC tag format
Source: http://www.symbol.com/products/whitepapers/rfid_key_issues.html

How Does EPC-RFID Work?
• RFID tags (transponders), affixed to cases, pallets, cartons, units or other objects, begin to transmit radio frequency signals when in the read zone of a stationary or mobile reader (interrogator).
• The reader picks up the signal and decodes the unique EPC which, using 96 bits, identifies the name, class and serial number of the product.
• This information is then matched with record data in the host computer system and database application, as shown on the next slide.
• THIS IS THE BIT WE ARE INTERESTED IN THIS WEEK
Source: http://www.symbol.com/products/whitepapers/rfid_key_issues.html

What Comms technology does EPC need?
• There is more to EPC than just the RFID tag.
• An EPC system relies on multiple components:
• 1) The EPC RFID tag: a flexible inlay embedded within a paper label, consisting of a computer microchip with an identification number and a miniature antenna.
• 2) The scanner: a device which emits the proper radio signals to activate and “read” the information in the chip.
• The RFID chip reflects a weak radio signal encoded with its identification number when illuminated by the radio signal emitted by a compatible scanner.
• The scanner receives the radio signal from the EPC RFID tags and decodes each tag's identification number.

What Comms technology does EPC need?
• 3) The database: an information archive with the data records linked to the identification number in the EPC RFID tag's microchip.
• The information in this database contains the relevant data associated with the microchip's serial number.
• 4) A network: a standardised method of sharing the relevant EPC data within a company and with the company's suppliers and customers.

A solution overview
Source: http://www.symbol.com/products/whitepapers/rfid_key_issues.html

Don’t have to use EPC
• "Closed" systems, such as a library or the University, can use their own numbering system (i.e. not EPC), as books may only be read in that library and not all around the world.
• That is why we see many case studies of tags being used in small "closed" applications.
• They don't have to wait for standards, but tagging things on a large open infrastructure means there has to be just one way of identifying everything, in this case EPC.
• This requires standardisation, which in turn needs time for everyone to agree on the right way to structure the numbers; this was one purpose of the Auto-ID Center.

Issues? Database capacity
• FEBRUARY 07, 2005 (COMPUTERWORLD) - ... data deluge that will soon rain down on IT.
• Radio frequency identification technology will arrive everywhere all too soon, and IT will be asked to manage the data that tracks "a product from raw materials to its final configuration."
• RFID will balloon the amount of data that's generated and make indexing the information in a relational database prohibitively expensive and all but impossible.
• The article cites an estimate that if Wal-Mart Stores Inc. logged all of its inventory via RFID tags for a single day, it would reach 7 million terabytes of data. "Most of the data you'll never need," he says. "But when you need it, you need it right now."

Database capacity - solution
• Database developer ObjectStore has created a solution that can collect and process, in real time, the vast amount of data produced by EPC networks.
• Cost? It is an enterprise solution, so expensive.
• ObjectStore link

Some aspects of ObjectStore
• An in-memory database lets the RFID system write data using in-memory constructs, such as a Java object graph, before finally committing that data to disk.
• This differs from a traditional database, which is designed to take very small, atomic changes to data and commit them one by one.
• The in-memory capability also enables data to be accessed and queried at a speed that far outpaces that of traditional databases and transactional systems, according to ObjectStore.
• Event Engine captures, caches and queries data close to the edge of the network. The goal is to keep and process the bulk of that data there, eliminating frequent access to back-end databases and the huge amount of network traffic that would be required to send that data back to centralised querying applications.

Database Capacity – solution?
• DIY approach

The role of the Internet?
• Object Name Service (ONS)
• Tells computer systems where to locate information on the Internet about any object carrying an EPC.
• ONS is similar to the Internet’s existing Domain Name System (DNS), which allows Internet routing computers to identify where the pages associated with a particular Web site are stored.
• The ONS takes the EPC and returns a Web address or Uniform Resource Locator (URL) where all information about that object resides.
• Makes it possible to store large amounts of information on the Internet rather than on the object or object label.

Physical Markup Language (PML)
• The Physical Markup Language (PML) is a new standard “language” for describing physical objects.
• Together with the EPC and ONS, PML completes the fundamental components needed to automatically link information with physical products.
• The EPC identifies the product;
• the PML describes the product;
• the ONS links them together.
• Standardising these components will provide “universal connectivity” between objects in the physical world.

Scanning process
• Replicate the input that would normally be received from an RF reader
• This will be done using XML
• A schema based on the EPC format
• A data document based on the schema
• This input needs to be in a format where it can be manipulated so that it can be compared/interrogated
• This will be done with a database: MS Access
• Design of database tables
• Population of tables with data
• Import XML document into database

EPC format
• Four constituent parts represent the code:
• The EPC code version (we will use 1.0)
• The manufacturer
• The product
• The instance of a product (the serial number)

An XML schema
• Reflects the EPC structure
• We might wish to add date of scan
• Schema would look like this


Data
<ScanData>
  <EPC1>01</EPC1>
  <EPC2>0000d04</EPC2>
  <EPC3>000012</EPC3>
  <EPC4>715248247</EPC4>
  <Datetime>2005-11-01T23:51:02</Datetime>
</ScanData>
<ScanData>
  <EPC1>01</EPC1>
  <EPC2>0000c02</EPC2>
  <EPC3>000013</EPC3>
  <EPC4>6713458</EPC4>
  <Datetime>2005-11-01T23:51:04</Datetime>
</ScanData>

Data
• Each section between <ScanData> and </ScanData> represents one read of a tag by the reader
• The data in XML format can then be transferred to a database, e.g. Access
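Before the import into a database, each read can be turned into a flat row. The sketch below uses Python's built-in XML parser and is illustrative only: the `<Scans>` wrapper element is an assumption (a well-formed XML document needs a single root element), and the row keys are hypothetical names for the EPC1 to EPC4 fields.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# <Scans> is an assumed wrapper; the two reads are the ones from the sample data.
SCANS_XML = """<Scans>
  <ScanData>
    <EPC1>01</EPC1><EPC2>0000d04</EPC2><EPC3>000012</EPC3>
    <EPC4>715248247</EPC4><Datetime>2005-11-01T23:51:02</Datetime>
  </ScanData>
  <ScanData>
    <EPC1>01</EPC1><EPC2>0000c02</EPC2><EPC3>000013</EPC3>
    <EPC4>6713458</EPC4><Datetime>2005-11-01T23:51:04</Datetime>
  </ScanData>
</Scans>"""

def read_scans(xml_text):
    """Return one dictionary per <ScanData> element, i.e. one per tag read."""
    rows = []
    for scan in ET.fromstring(xml_text).findall("ScanData"):
        rows.append({
            "version": scan.findtext("EPC1"),
            "manufacturer": scan.findtext("EPC2"),
            "product": scan.findtext("EPC3"),
            "serial": scan.findtext("EPC4"),
            "scanned_at": datetime.fromisoformat(scan.findtext("Datetime")),
        })
    return rows

rows = read_scans(SCANS_XML)
```

The EPC fields are kept as strings so that leading zeros such as those in `0000d04` survive the import.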

The database design
• What tables do we need?
• A make table, which corresponds to EPC2 (the manufacturer)
• An item category table, which corresponds to EPC3 (the product)
• An item table, which corresponds to EPC4 (the particular asset)
• It is necessary to think about the structure of each table and, of course, the relationships between the tables

Tables and relationships
• Support tables:
• Manufacturer: holds manufacturer/make details
• Product: holds product type details, e.g. standard descriptions of equipment
• Asset data table:
• The assets themselves, based on an asset ID (serial number)
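One possible shape for these three tables is sketched below. It is not the Access design itself: SQLite (via Python's built-in sqlite3) stands in for Access, and the column names, composite keys and sample rows are assumptions chosen to mirror the EPC2, EPC3 and EPC4 fields.

```python
import sqlite3

SCHEMA = """
CREATE TABLE manufacturer (
    epc2 TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE product (
    epc2 TEXT NOT NULL REFERENCES manufacturer(epc2),
    epc3 TEXT NOT NULL,
    description TEXT NOT NULL,
    PRIMARY KEY (epc2, epc3)
);
CREATE TABLE asset (
    epc2 TEXT NOT NULL,
    epc3 TEXT NOT NULL,
    epc4 TEXT NOT NULL,
    location TEXT,
    PRIMARY KEY (epc2, epc3, epc4),
    FOREIGN KEY (epc2, epc3) REFERENCES product(epc2, epc3)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
conn.executescript(SCHEMA)

# Made-up sample rows.
conn.execute("INSERT INTO manufacturer VALUES ('0000d04', 'Example Maker')")
conn.execute("INSERT INTO product VALUES ('0000d04', '000012', 'Example laptop')")
conn.execute("INSERT INTO asset VALUES ('0000d04', '000012', '715248247', 'Room 1')")

def describe(conn, epc2, epc3, epc4):
    """Join the three tables to describe one scanned asset, or return None."""
    return conn.execute(
        "SELECT m.name, p.description, a.location "
        "FROM asset a "
        "JOIN product p ON p.epc2 = a.epc2 AND p.epc3 = a.epc3 "
        "JOIN manufacturer m ON m.epc2 = p.epc2 "
        "WHERE a.epc2 = ? AND a.epc3 = ? AND a.epc4 = ?",
        (epc2, epc3, epc4),
    ).fetchone()
```

A scan whose EPC has no matching asset row returns `None`, which is one way a discrepancy would surface; the foreign keys stop an asset being recorded for a product or manufacturer that does not exist.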

Relationships

Physical Markup Language (PML)
• PML: what is it?
• The Physical Markup Language (PML) is a new standard “language” for describing physical objects.
• Version 1.0 available
• Needs to be used with the EPC and ONS
• PML completes the network components needed to automatically link information with physical products.
• The EPC identifies the product
• The PML describes the product
• The ONS links them together.
• Why bother? Standardising these components will provide “universal connectivity” between objects in the physical world.

PML
• PML Core
• Used to describe data directly generated by the Auto-ID infrastructure, e.g. RFID readers, “aggregation” sensors, temperature sensors
• PML Extensions
• Used to provide data describing Auto-ID enabled objects, e.g. product-related information and process-related information
• Reference: Auto-ID Center

PML Schemas
• PML Core Specification 1.0 is available from the Auto-ID Center
• Based on the W3C XML Schema release
• Uses XML Schema features to enforce datatypes and structure
• Developed following the guidelines of the ebXML Component Specification (CCTS Version 1.8)
• Section 6 (Appendix, pages 39-43) provides XML schemas for 'PmlCore.xsd' and 'Identifier.xsd'
• PML definitions cover EPC Network system-related data: a vocabulary of sorts

Available from Jonathan via email j.c.westlake@staffs.ac.uk
