HTML to JSON Converter

HTML to JSON Converter converts an HTML document to JSON by extracting the rows from HTML tables and converting them to JSON format. The HTML is parsed, data types are detected automatically and converted to the appropriate format in the JSON output, and the JSON output is then formatted and indented for easy viewing.

HTML is the language of the web. It is what creates web pages (even this one). In the early days, HTML was static, with some JavaScript added for dynamic behavior and effects. Later, HTML was generated dynamically on the server with the advent of server-side programming languages such as Perl, PHP, and ASP. Now there is a trend where HTML is again served as a static resource, with JSON (from REST web services) and JavaScript making it dynamic.

JavaScript Object Notation (JSON), pronounced like "Jason", is the most common data interchange format on the web. Douglas Crockford first released the JSON specification in the early 2000s. It is a simple format that is easier to comprehend than XML. It is also usually smaller because it does not have closing tags. A wide variety of programming languages can parse JSON and serialize data structures to JSON. You can also paste JSON text into JavaScript source and use it as an object literal without modification.

Settings Explained
  1. Indent
    This setting governs whether the output is indented. Indented output is easier to read, while non-indented output is compact. The smaller size is better for transmission over the network, so JSON is often minified by removing non-essential whitespace.
    • Indentation On
      {
        "name": "John Doe",
        "age": 69
      }
    • Indentation Off
      {"name":"John Doe","age":69}
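
The two forms above can be reproduced with any JSON serializer. The following is a minimal sketch using Python's standard json module; it illustrates the idea only and is not the converter's own code. The variable names are chosen for this example.

```python
import json

record = {"name": "John Doe", "age": 69}

# Indentation on: two spaces per nesting level, one property per line.
indented = json.dumps(record, indent=2)

# Indentation off: override the default ", " and ": " separators
# so that no optional whitespace is emitted.
compact = json.dumps(record, separators=(",", ":"))

print(indented)
print(compact)
```

Both strings parse back to the same data; only the whitespace differs.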
  2. Unescape JSON
    If selected and the input appears to be HTML wrapped in a JSON string, the input is unescaped before processing.
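
As an illustration of what unescaping means here, the sketch below decodes HTML that arrives as a JSON string literal (quotes escaped as \" and newlines as \n). How the tool decides that the input "appears to be" a JSON string is not documented, so the check used here (surrounding double quotes plus a successful decode) is an assumption, and unescape_if_json_string is a hypothetical helper name.

```python
import json

def unescape_if_json_string(raw: str) -> str:
    """Return the decoded text if raw is a JSON string literal, else raw unchanged."""
    text = raw.strip()
    # Assumed heuristic: a JSON string literal starts and ends with a double quote.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        try:
            decoded = json.loads(text)
        except json.JSONDecodeError:
            return raw
        if isinstance(decoded, str):
            return decoded
    return raw

# Raw string literal: the backslashes are part of the input, as they would be
# when HTML is copied out of a JSON payload.
wrapped = r'"<table style=\"width: 100%\">\n<tr><td>Jill</td></tr>\n</table>"'
print(unescape_if_json_string(wrapped))
```

Input that is already plain HTML does not start with a double quote, so it passes through unchanged.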
  3. Mode
    Generic: In this mode, all HTML nodes are converted into JSON objects and properties. The available options are Attribute Prefix and Text Property Name.
    • Input
      <html>
      <body>
      <table style="width: 100%">
          <tr>
              <th>Firstname</th>
              <th>Lastname</th>
              <th>Age</th>
          </tr>
          <tr>
              <td>Jill</td>
              <td>Smith</td>
              <td>50</td>
          </tr>
          <tr>
              <td>Eve</td>
              <td>Jackson</td>
              <td>94</td>
          </tr>
      </table>
      </body>
      </html>
    • Output
      {
        "html": {
          "body": {
            "table": {
              "@style": "width: 100%",
              "tr": [
                {
                  "th": [
                    "Firstname",
                    "Lastname",
                    "Age"
                  ]
                },
                {
                  "td": [
                    "Jill",
                    "Smith",
                    "50"
                  ]
                },
                {
                  "td": [
                    "Eve",
                    "Jackson",
                    "94"
                  ]
                }
              ]
            }
          }
        }
      }
    Table: In this mode, HTML <TABLE> nodes are converted into JSON objects and properties. Each <TR> after the header row is converted into a JSON object. The cells of the header row become the JSON property names, while the cells of the other rows become the property values.
    • Input
      <html>
      <body>
      <table style="width: 100%">
          <tr>
              <th>Firstname</th>
              <th>Lastname</th>
              <th>Age</th>
          </tr>
          <tr>
              <td>Jill</td>
              <td>Smith</td>
              <td>50</td>
          </tr>
          <tr>
              <td>Eve</td>
              <td>Jackson</td>
              <td>94</td>
          </tr>
      </table>
      </body>
      </html>
    • Output
      [
        {
          "Firstname": "Jill",
          "Lastname": "Smith",
          "Age": 50
        },
        {
          "Firstname": "Eve",
          "Lastname": "Jackson",
          "Age": 94
        }
      ]
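
The Table mode mapping above can be sketched with Python's standard html.parser module. This is an illustrative approximation, not the converter's implementation: the class and function names are hypothetical, the first row is assumed to be the header, and the type detection rule (try int, then float, otherwise keep the string) is an assumption about what "automatically detected" means.

```python
import json
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def detect_type(text):
    """Assumed detection rule: try int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def table_to_records(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows  # assumption: the first row holds the property names
    return [dict(zip(header, map(detect_type, row))) for row in body]

html = """<table style="width: 100%">
  <tr><th>Firstname</th><th>Lastname</th><th>Age</th></tr>
  <tr><td>Jill</td><td>Smith</td><td>50</td></tr>
  <tr><td>Eve</td><td>Jackson</td><td>94</td></tr>
</table>"""
print(json.dumps(table_to_records(html), indent=2))
```

Note that Age comes out as a number here, as in the Table mode output above, whereas Generic mode keeps cell text as strings.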
    JSON-LD: In this mode, all JSON-LD is extracted from the HTML and output as JSON. Each JSON-LD item becomes an array item in the final output.
    • Input
      <html>
      <body>
          <div class="row">
              <script type="application/ld+json">
                  {
                  "@context": "http://schema.org/",
                  "@type": "Person",
                  "name": "Jane Doe",
                  "jobTitle": "Professor",
                  "telephone": "(425) 123-4567",
                  "url": "http://www.janedoe.com"
                  }
              </script>
          </div>
          <div class="row">
              <script type="application/ld+json">
                  {
                  "@context": "http://schema.org/",
                  "@type": "Person",
                  "name": "John Doe",
                  "jobTitle": "Dancer",
                  "telephone": "(425) 123-4568",
                  "url": "http://www.johndoe.com"
                  }
              </script>
          </div>
      </body>
      </html>
    • Output
      [
        {
          "@context": "http://schema.org/",
          "@type": "Person",
          "name": "Jane Doe",
          "jobTitle": "Professor",
          "telephone": "(425) 123-4567",
          "url": "http://www.janedoe.com"
        },
        {
          "@context": "http://schema.org/",
          "@type": "Person",
          "name": "John Doe",
          "jobTitle": "Dancer",
          "telephone": "(425) 123-4568",
          "url": "http://www.johndoe.com"
        }
      ]
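
JSON-LD extraction amounts to collecting the body of every script element whose type is application/ld+json and parsing each body as JSON. A minimal sketch with Python's standard library follows; the class and function names are hypothetical, and silently skipping blocks that fail to parse is an assumption, since the tool's error handling is not described.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Parse the body of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self.items = []       # parsed JSON-LD items, in document order
        self._buffer = None   # text of the script element being read

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            text = "".join(self._buffer)
            self._buffer = None
            try:
                self.items.append(json.loads(text))
            except json.JSONDecodeError:
                pass  # assumption: blocks that are not valid JSON are skipped

def extract_json_ld(html):
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items

page = """<html><body>
<div class="row"><script type="application/ld+json">
  {"@context": "http://schema.org/", "@type": "Person", "name": "Jane Doe"}
</script></div>
<div class="row"><script type="application/ld+json">
  {"@context": "http://schema.org/", "@type": "Person", "name": "John Doe"}
</script></div>
<script>var notJsonLd = 1;</script>
</body></html>"""
print(json.dumps(extract_json_ld(page), indent=2))
```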
  4. Attribute Prefix
    The prefix to use for properties that correspond to HTML attributes. Leave blank to use no prefix.
    • Input
      <html>
      <body>
      <table style="width: 100%">
          <tr>
              <th>Firstname</th>
              <th>Lastname</th>
              <th>Age</th>
          </tr>
          <tr>
              <td>Jill</td>
              <td>Smith</td>
              <td>50</td>
          </tr>
          <tr>
              <td>Eve</td>
              <td>Jackson</td>
              <td>94</td>
          </tr>
      </table>
      </body>
      </html>
    • Attribute Prefix: @
      {
        "html": {
          "body": {
            "table": {
              "@style": "width: 100%",
              "tr": [
                {
                  "th": [
                    "Firstname",
                    "Lastname",
                    "Age"
                  ]
                },
                {
                  "td": [
                    "Jill",
                    "Smith",
                    "50"
                  ]
                },
                {
                  "td": [
                    "Eve",
                    "Jackson",
                    "94"
                  ]
                }
              ]
            }
          }
        }
      }
    • Attribute Prefix: Empty
      {
        "html": {
          "body": {
            "table": {
              "style": "width: 100%",
              "tr": [
                {
                  "th": [
                    "Firstname",
                    "Lastname",
                    "Age"
                  ]
                },
                {
                  "td": [
                    "Jill",
                    "Smith",
                    "50"
                  ]
                },
                {
                  "td": [
                    "Eve",
                    "Jackson",
                    "94"
                  ]
                }
              ]
            }
          }
        }
      }
  5. Text Property Name
    The name of the property that holds the value of HTML text nodes.
    • Input
      <html>
      <body>
      <div>
          <p>
              Pre Header
              <h1>Title</h1>
              Post Header
          </p>
      </div>
      </body>
      </html>
    • Text Property Name: #text
      {
        "html": {
          "body": {
            "div": {
              "p": {
                "h1": "Title",
                "#text": [
                  "Pre Header",
                  "Post Header"
                ]
              }
            }
          }
        }
      }
    • Text Property Name: text
      {
        "html": {
          "body": {
            "div": {
              "p": {
                "h1": "Title",
                "text": [
                  "Pre Header",
                  "Post Header"
                ]
              }
            }
          }
        }
      }
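
To show how Attribute Prefix and Text Property Name shape the Generic mode output, here is a rough sketch built on Python's xml.etree.ElementTree. It is not the converter's algorithm. It assumes the markup is well-formed enough to parse as XML (true for the examples on this page, often false for real-world HTML), and the function names and the rule that a text-only element collapses to a plain string are inferred from the sample outputs above.

```python
import json
import xml.etree.ElementTree as ET

def add(obj, key, value):
    """Store value under key; repeated keys are gathered into a list."""
    if key not in obj:
        obj[key] = value
    elif isinstance(obj[key], list):
        obj[key].append(value)
    else:
        obj[key] = [obj[key], value]

def node_to_json(node, attr_prefix="@", text_name="#text"):
    obj = {}
    # Attributes become properties named prefix + attribute name.
    for name, value in node.attrib.items():
        obj[attr_prefix + name] = value
    # Child elements become properties named after their tag.
    for child in node:
        add(obj, child.tag, node_to_json(child, attr_prefix, text_name))
    # Text directly inside this element: before the first child (node.text)
    # and after each child (child.tail).
    for text in [node.text] + [child.tail for child in node]:
        if text and text.strip():
            add(obj, text_name, text.strip())
    # Inferred rule: an element holding only text collapses to a plain string.
    if list(obj) == [text_name] and isinstance(obj[text_name], str):
        return obj[text_name]
    return obj

def html_to_json(markup, attr_prefix="@", text_name="#text"):
    root = ET.fromstring(markup)
    return {root.tag: node_to_json(root, attr_prefix, text_name)}

markup = ("<html><body><div><p>Pre Header<h1>Title</h1>Post Header</p></div>"
          "</body></html>")
print(json.dumps(html_to_json(markup), indent=2))
print(json.dumps(html_to_json(markup, text_name="text"), indent=2))
```

With an empty attribute prefix, an attribute and a child element that share a name (for example style) would collide in this sketch, which is one reason a non-empty prefix such as @ is a sensible default.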