
Large Objects (LOBs) for Semi-Structured and Unstructured Data


Large objects (LOBs) are data types used to store semi-structured and unstructured data in a database. LOBs are typically used for data that is too large to fit into a traditional data type, such as text documents, images, videos, and audio files.

LOBs are particularly useful for semi-structured and unstructured data, because such data often does not fit neatly into structured data types like short strings or numbers. LOBs allow more flexible storage and retrieval, as they can hold a wide range of content types and sizes.

Several types of LOBs are commonly used in databases, including:

  • Binary large objects (BLOBs): Used for storing binary data, such as images, audio, and video files.
  • Character large objects (CLOBs): Used for storing character data, such as text documents or HTML files.
  • National character large objects (NCLOBs): Similar to CLOBs, but used for storing character data in the database's national character set, typically for multilingual text.
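
The difference between a character LOB and a binary LOB can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for a full database server, it treats a column declared as CLOB like TEXT, and the table name, column names, and payloads are made up for the example.

```python
import sqlite3

# SQLite is only a stand-in here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id        INTEGER PRIMARY KEY,
        body      CLOB,   -- character data (SQLite treats this as TEXT)
        thumbnail BLOB    -- raw binary data
    )
    """
)

html_text = "<html><body>Hello, LOB</body></html>"
image_bytes = b"\x89PNG" + bytes(1024)  # made-up payload, not a real image

conn.execute(
    "INSERT INTO documents (body, thumbnail) VALUES (?, ?)",
    (html_text, image_bytes),
)

body, thumbnail = conn.execute(
    "SELECT body, thumbnail FROM documents"
).fetchone()

print(type(body).__name__, len(body))            # character LOB comes back as str
print(type(thumbnail).__name__, len(thumbnail))  # binary LOB comes back as bytes
```

The character column round-trips as text, while the binary column round-trips as an uninterpreted byte sequence.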

LOBs can be particularly useful for storing data that is too large or too complex to fit into a traditional database schema. However, they can also present some challenges, particularly in terms of performance and scalability. LOBs can require significant processing and storage resources, and can be slower to retrieve than structured data types. As a result, they are typically used judiciously and with careful consideration of the specific use case and requirements.
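
One common way to limit the memory cost of a large LOB on the application side is to read it in pieces instead of all at once. The sketch below assumes a hypothetical `media` table in SQLite and uses the SQL `substr()` function to fetch one slice per query; the helper name and the chunk size are illustrative choices, and other databases offer their own piece-wise LOB interfaces.

```python
import sqlite3

CHUNK_SIZE = 64 * 1024  # hypothetical piece size in bytes


def read_blob_in_chunks(conn, row_id, chunk_size=CHUNK_SIZE):
    """Yield the BLOB stored for row_id piece by piece."""
    (total,) = conn.execute(
        "SELECT length(payload) FROM media WHERE id = ?", (row_id,)
    ).fetchone()
    offset = 1  # SQL substr() offsets start at 1
    while offset <= total:
        (chunk,) = conn.execute(
            "SELECT substr(payload, ?, ?) FROM media WHERE id = ?",
            (offset, chunk_size, row_id),
        ).fetchone()
        yield chunk
        offset += len(chunk)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (id INTEGER PRIMARY KEY, payload BLOB)")
conn.execute("INSERT INTO media (id, payload) VALUES (1, ?)", (bytes(200_000),))

chunks = list(read_blob_in_chunks(conn, 1))
print(len(chunks), sum(len(c) for c in chunks))
```

With this approach the application holds only one piece at a time while iterating, at the price of one query per piece.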

Semi-structured data

Semi-structured data is data that does not conform to a formal data model but still has some structure. It lacks a fixed or rigid schema. It does not reside in a relational database in its native form, but it has organisational properties that make it easier to analyse, and with some processing it can be stored in a relational database.

Characteristics of semi-structured data:

  • Data does not conform to a data model but has some structure
  • Data cannot be stored directly in rows and columns as in relational databases
  • Semi-structured data contains tags and elements (metadata) that are used to group data and describe how the data is stored
  • Similar entities are grouped together and organised in a hierarchy
  • Entities in the same group may or may not have the same attributes or properties
  • Data does not contain sufficient metadata, which makes automation and management of the data difficult
  • Size and type of the same attributes in a group may differ
  • Due to the lack of a well-defined structure, it cannot be used by computer programs easily
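
Several of the characteristics above can be seen in a small JSON document. The following is a minimal sketch with made-up records: the keys act as self-describing tags, entities are grouped by a `type` field, and entities in the same group do not share the same attributes or value types.

```python
import json

# Made-up records; "type" groups similar entities together.
raw = """
[
  {"type": "book", "title": "Title A", "author": "Author A", "pages": 300},
  {"type": "book", "title": "Title B", "author": "Author B"},
  {"type": "film", "title": "Title C", "length": "two hours"}
]
"""

records = json.loads(raw)

# Collect the distinct attribute sets seen within each group.
attributes_by_type = {}
for record in records:
    keys = tuple(sorted(record))
    attributes_by_type.setdefault(record["type"], set()).add(keys)

for group, attribute_sets in attributes_by_type.items():
    print(group, len(attribute_sets), "distinct attribute set(s)")
```

The two `book` entries yield two different attribute sets, which is exactly what a fixed row-and-column schema would not allow without extra processing.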

Using LOBs for Semi-structured Data

Document files such as XML documents or word processor files are examples of semi-structured data. These documents contain data in a logical structure that is interpreted or processed by an application, and the data is not broken down into smaller logical units when stored in the database. Applications that work with semi-structured data typically use large amounts of character data. The Character Large Object (CLOB) and National Character Large Object (NCLOB) datatypes are available for storing and manipulating this kind of data. Binary file objects (the BFILE datatype) can also be used to store character data. BFILEs can also be used to load read-only data from the operating system into CLOB or NCLOB instances so that the application can manipulate it.
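
The idea that the database stores the document whole while the application interprets its structure can be sketched as follows. SQLite is used as a stand-in (a column declared as CLOB behaves like TEXT there), and the table name, column name, and XML content are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up XML document; the second employee has no <email> element.
xml_doc = """<employees>
  <employee id="1"><name>Asha</name><email>asha@example.com</email></employee>
  <employee id="2"><name>Ravi</name></employee>
</employees>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xml_docs (id INTEGER PRIMARY KEY, content CLOB)")

# The whole document goes into one character LOB column, unparsed.
conn.execute("INSERT INTO xml_docs (content) VALUES (?)", (xml_doc,))

(stored,) = conn.execute(
    "SELECT content FROM xml_docs WHERE id = 1"
).fetchone()

# Only the application interprets the logical structure.
root = ET.fromstring(stored)
names = [e.findtext("name") for e in root.findall("employee")]
emails = [e.findtext("email") for e in root.findall("employee")]

print(names)
print(emails)
```

The database never sees individual names or e-mail addresses; it stores and returns one block of character data.
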

Unstructured data

Unstructured data is data that does not conform to a data model and has no easily identifiable structure, so a computer program cannot use it easily. It is not organised in a pre-defined manner and has no pre-defined data model, so it is not a good fit for a mainstream relational database.

Characteristics of unstructured data:

  • Data neither conforms to a data model nor has any structure
  • Data cannot be stored directly in rows and columns as in relational databases
  • Data does not follow any semantics or rules
  • Data lacks any particular format or sequence
  • Data has no easily identifiable structure
  • Due to the lack of an identifiable structure, it cannot be used by computer programs easily

Using LOBs for Unstructured Data

Unstructured data cannot be broken into standard components. For example, data about an employee can be separated into a name, stored as a string; an ID number, stored as an integer; a salary; and so on. A photograph, on the other hand, consists of a long stream of 1s and 0s. These bits are used to switch pixels on and off so that the picture can be seen on a display, but they are not broken down into any finer structure for database storage. Unstructured data such as graphic images, still video clips, motion video, and sound waveforms also tends to be large: a typical employee record may be a few hundred bytes, while even a small amount of multimedia data can be thousands of times larger. The datatypes best suited to large amounts of unstructured data are BLOB (Binary Large Object) and BFILE (binary file object).
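
The contrast between structured fields and a binary payload can be sketched with a hypothetical `employees` table. The photo here is a placeholder of zero bytes sized like an uncompressed 640 x 480 RGB image, not real image data, and SQLite is again only a stand-in database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employees (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        salary INTEGER,
        photo  BLOB   -- opaque stream of bytes, no internal structure
    )
    """
)

width, height, bytes_per_pixel = 640, 480, 3
photo = bytes(width * height * bytes_per_pixel)  # placeholder pixel data

conn.execute(
    "INSERT INTO employees (id, name, salary, photo) VALUES (?, ?, ?, ?)",
    (101, "Asha Rao", 55000, photo),
)

name_len, photo_len = conn.execute(
    "SELECT length(name), length(photo) FROM employees WHERE id = 101"
).fetchone()

print("name characters:", name_len)
print("photo bytes:", photo_len)
print("ratio:", photo_len // name_len)
```

The name, ID, and salary are separate typed columns, while the photo is a single binary value whose size dwarfs the rest of the row.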


Last Updated : 22 Feb, 2023