Data Extraction from Tables on Web Pages Using the Document Object Model Tree
Abstract: Data on web pages can be available in various formats, such as
tables. As the number of web pages grows, the need to extract data from tables
is increasing. The extraction results can be integrated with other web tables
or stored in a database. This study discusses the extraction of data from a
table on a web page using a Document Object Model (DOM) tree. The first step of
the extraction process is to transform the HTML document into a DOM tree. Then,
by applying the Depth First Search (DFS) method, the data portion of the table
is extracted and stored in a CSV file. An engine has been developed using
Visual Basic. The results show that the engine can automatically extract data
from tables with the following characteristics: an unlimited number of rows and
columns, any table orientation layout, and merged cells.
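
The pipeline described in the abstract (HTML to DOM tree, DFS over the tree, table data to CSV) can be illustrated with a minimal sketch. It is written in Python rather than the Visual Basic used for the original engine, and it is not the authors' implementation. All names (Node, TreeBuilder, dfs, table_to_grid, extract_first_table_csv) are hypothetical. The sketch assumes well-formed markup with explicit closing tags and no nested tables. It also assumes that a merged cell is expanded by repeating its text in every grid position it covers, which the abstract does not specify. Table orientation is not addressed.

```python
import csv
import io
from html.parser import HTMLParser

# Elements that never have children, so the builder must not descend into them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """One node of a simplified DOM tree (element, text, or document root)."""

    def __init__(self, tag, attrs=None, parent=None, text=""):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = text


class TreeBuilder(HTMLParser):
    """Turns the parser's tag events into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(Node("#text", parent=self.current, text=data))


def dfs(node):
    """Yield nodes in depth-first (pre-order) order using an explicit stack."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        stack.extend(reversed(current.children))


def text_of(node):
    """Concatenate all text below a node and collapse runs of whitespace."""
    raw = "".join(n.text for n in dfs(node) if n.tag == "#text")
    return " ".join(raw.split())


def table_to_grid(table):
    """Expand rowspan/colspan so every grid position holds a value."""
    grid = {}  # (row, col) -> cell text
    rows = [n for n in dfs(table) if n.tag == "tr"]
    for r, tr in enumerate(rows):
        c = 0
        for cell in tr.children:
            if cell.tag not in ("td", "th"):
                continue
            while (r, c) in grid:  # skip positions filled by an earlier rowspan
                c += 1
            rowspan = int(cell.attrs.get("rowspan") or 1)
            colspan = int(cell.attrs.get("colspan") or 1)
            value = text_of(cell)
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = value
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def extract_first_table_csv(html):
    """Return the first table in the document as CSV text ('' if none)."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    table = next((n for n in dfs(builder.root) if n.tag == "table"), None)
    if table is None:
        return ""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(table_to_grid(table))
    return out.getvalue()
```

For a table whose first header cell spans two rows and whose second header cell spans two columns, the sketch repeats the merged text in each covered position, so every CSV row has the same number of fields.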
Authors: Memen Akbar, Cici Patmala, Dini Nurmalasari
Journal Code: jptlisetrodd160426