Using content types

Note: This discussion deliberately avoids the word file when talking about content. The runtime content engine does not assume that content is contained in a file in the file system. However, it does include API that allows content types to be associated with file-naming patterns. In practice, these file names represent files in the file system, but nothing in the implementation of the content system assumes that the content is located there. The topic "File encoding and content types" discusses the file-oriented content type capabilities contributed by the platform resources plug-in and is a must-read for developers interested in using the content type API in that context.

Finding out about content types

Content types are represented by IContentType. This interface represents a unique content type that knows how to read a data stream and interpret content type-specific information. Content types are hierarchical in nature. For example, a content type for XML data is considered a child of the text content type. This allows new content types to leverage the attributes or behavior of more general content types.

The IContentTypeManager is the entry point to most of the content-type-related API provided by the platform runtime. To obtain a reference to the platform IContentTypeManager, clients can use the Platform API:

IContentTypeManager contentTypeManager = Platform.getContentTypeManager();

Clients can use the platform IContentTypeManager to find out about the content types in the system.

Detecting the content type for a data stream

Given a stream of bytes, it is possible to determine its content type by calling the IContentTypeManager API as follows:

InputStream stream = ...;
IContentType contentType = contentTypeManager.findContentTypeFor(stream, "file.xml");

This returns the most appropriate IContentType for the input provided, or null if none can be found. Multiple content types might be deemed appropriate for a given data stream, in which case the platform uses heuristics to determine which one should be selected. The file name is the first criterion by which content types are selected. It can be omitted, but doing so has two drawbacks. First, the results might be less accurate, because many unrelated content types might accept the same input. Second, there is a significant performance cost, since every content type in the platform has to be given a chance to analyze the stream. Clients should therefore always provide a file name along with the stream, unless no file name is available.
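The effect of supplying a file name can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: ToyContentType and ToyContentTypeFinder are hypothetical classes that do not exist in the platform, file-naming patterns are reduced to a simple extension check, and the "first registered type wins" tie-break is arbitrary and is not the heuristic the platform actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: ToyContentType and ToyContentTypeFinder are hypothetical
// and are not part of the platform content type API.
final class ToyContentType {
    final String id;
    final String fileExtension;       // file-naming pattern, reduced to an extension
    final Predicate<byte[]> accepts;  // stands in for analysis of the content

    ToyContentType(String id, String fileExtension, Predicate<byte[]> accepts) {
        this.id = id;
        this.fileExtension = fileExtension;
        this.accepts = accepts;
    }
}

final class ToyContentTypeFinder {
    private final List<ToyContentType> registry = new ArrayList<>();
    int analyzersConsulted; // how many types analyzed the content in the last lookup

    void register(ToyContentType type) {
        registry.add(type);
    }

    // Returns the selected type, or null if no type accepts the input.
    ToyContentType find(byte[] content, String fileName) {
        analyzersConsulted = 0;
        ToyContentType selected = null;
        for (ToyContentType type : registry) {
            // The file name is the first criterion: it rules out candidates cheaply.
            if (fileName != null && !fileName.endsWith("." + type.fileExtension)) {
                continue;
            }
            analyzersConsulted++;
            // Arbitrary toy tie-break: the first registered type that accepts wins.
            if (type.accepts.test(content) && selected == null) {
                selected = type;
            }
        }
        return selected;
    }

    static boolean startsWith(byte[] content, String prefix) {
        return new String(content, StandardCharsets.ISO_8859_1).startsWith(prefix);
    }

    // Builds a small registry: a catch-all text type plus two stricter types.
    static ToyContentTypeFinder sample() {
        ToyContentTypeFinder finder = new ToyContentTypeFinder();
        finder.register(new ToyContentType("toy.text", "txt", bytes -> true));
        finder.register(new ToyContentType("toy.html", "html", bytes -> startsWith(bytes, "<html")));
        finder.register(new ToyContentType("toy.xml", "xml", bytes -> startsWith(bytes, "<?xml")));
        return finder;
    }

    public static void main(String[] args) {
        ToyContentTypeFinder finder = sample();
        byte[] xml = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(finder.find(xml, "file.xml").id + ", analyzers: " + finder.analyzersConsulted);
        System.out.println(finder.find(xml, null).id + ", analyzers: " + finder.analyzersConsulted);
    }
}
```

With the name "file.xml", the model consults a single type and selects toy.xml. Without a name, all three types analyze the content and the catch-all toy.text is selected, which mirrors both drawbacks described above: a less accurate result and more analysis work.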

Describing a data stream

Another useful feature of the content type support in the platform is the ability to describe the contents of a binary or character stream. The following code snippet shows how to do that:

InputStream stream = ...; 
IContentDescription description = contentTypeManager.getDescriptionFor(stream, "file.xml");

The returned IContentDescription instance describes the content type and any additional relevant information extracted from the contents provided. The content description stores content-specific properties in the form of key/value pairs. The platform itself is able to describe properties such as the character set and the byte order of text-based streams, but others can be defined by content type providers.
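To make the idea of key/value properties concrete, the following stand-alone sketch derives a character set and a byte order mark from the first bytes of a stream. It is an illustration only: ToyStreamDescriber and its string keys are hypothetical, the logic is not the platform's implementation, and only the UTF-8 and UTF-16 byte order marks are recognized. In the real API, properties are read from the description with IContentDescription.getProperty, using keys such as IContentDescription.CHARSET and IContentDescription.BYTE_ORDER_MARK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: ToyStreamDescriber and its string keys are hypothetical
// and are not part of the platform content type API.
final class ToyStreamDescriber {
    static final String CHARSET = "charset";
    static final String BYTE_ORDER_MARK = "byteOrderMark";

    // Reads at most the first three bytes of the stream and describes them.
    static Map<String, String> describe(InputStream stream) throws IOException {
        byte[] head = new byte[3];
        int length = stream.readNBytes(head, 0, head.length);
        return describe(head, length);
    }

    // Describes the first 'length' bytes of 'head' as key/value pairs.
    // UTF-32 byte order marks are ignored in this sketch.
    static Map<String, String> describe(byte[] head, int length) {
        Map<String, String> properties = new LinkedHashMap<>();
        if (startsWith(head, length, 0xEF, 0xBB, 0xBF)) {
            properties.put(BYTE_ORDER_MARK, "UTF-8");
            properties.put(CHARSET, "UTF-8");
        } else if (startsWith(head, length, 0xFE, 0xFF)) {
            properties.put(BYTE_ORDER_MARK, "UTF-16BE");
            properties.put(CHARSET, "UTF-16");
        } else if (startsWith(head, length, 0xFF, 0xFE)) {
            properties.put(BYTE_ORDER_MARK, "UTF-16LE");
            properties.put(CHARSET, "UTF-16");
        }
        return properties; // empty when no byte order mark is present
    }

    private static boolean startsWith(byte[] data, int length, int... prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A stream that begins with the bytes EF BB BF yields both properties set to "UTF-8", while a stream without a byte order mark yields an empty map. A content type provider could contribute further keys in the same way.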

Providing content-sensitive features

New content types are often defined as specializations of existing ones. This hierarchy establishes an "is a" relationship between a derived content type and its base type. Plug-in developers must honor this when implementing content-sensitive features. If a given feature is applicable to a given content type, the feature must be applicable to any derived content types as well. The IContentType.isKindOf(IContentType superType) method determines whether a content type is the same as, or derived from, the given content type. The IContentType.getBaseType() method returns the base type of a given IContentType.
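The rule can be sketched with a small stand-alone model of the hierarchy. ToyHierarchyType and ToyTextFeature are hypothetical classes used for illustration and are not part of the platform; the sketch assumes that a type counts as a kind of itself and of every ancestor reachable through its base type.

```java
// Toy model only: ToyHierarchyType and ToyTextFeature are hypothetical
// and are not part of the platform content type API.
final class ToyHierarchyType {
    private final String id;
    private final ToyHierarchyType baseType; // null for a root type

    ToyHierarchyType(String id, ToyHierarchyType baseType) {
        this.id = id;
        this.baseType = baseType;
    }

    String getId() {
        return id;
    }

    ToyHierarchyType getBaseType() {
        return baseType;
    }

    // True if this type is 'other' itself or derives from it, directly or indirectly.
    boolean isKindOf(ToyHierarchyType other) {
        for (ToyHierarchyType type = this; type != null; type = type.baseType) {
            if (type == other) {
                return true;
            }
        }
        return false;
    }
}

// A feature registered against one content type must also accept derived types.
final class ToyTextFeature {
    private final ToyHierarchyType supportedType;

    ToyTextFeature(ToyHierarchyType supportedType) {
        this.supportedType = supportedType;
    }

    boolean appliesTo(ToyHierarchyType candidate) {
        // Comparing with isKindOf rather than equality honors the "is a" relationship.
        return candidate != null && candidate.isKindOf(supportedType);
    }
}
```

A feature written against the text type in this model accepts an XML type derived from text, and any type derived from XML, without naming those types explicitly. Testing for equality with the supported type instead would silently exclude derived types.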