{"id":3280,"date":"2024-09-26T09:49:43","date_gmt":"2024-09-26T17:49:43","guid":{"rendered":"http:\/\/pididu.com\/wordpress\/?p=3280"},"modified":"2024-09-26T10:00:21","modified_gmt":"2024-09-26T18:00:21","slug":"convert-json-database-file-to-excel","status":"publish","type":"post","link":"http:\/\/pididu.com\/wordpress\/blog\/convert-json-database-file-to-excel\/","title":{"rendered":"Convert JSON Database File to Excel"},"content":{"rendered":"\n<p>This python script is a better solution than my <a href=\"http:\/\/pididu.com\/wordpress\/blog\/convert-json-file-to-excel-without-paying\/\">previous post<\/a>.  It was generated with a prompt to  Microsoft&#8217;s version of chatGPT, something like &#8220;Write a python program to translate a unicode JSON file of arbitrary size into comma-separated values.&#8221;   I did a little manual fix-up on the script that was generated, and got:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#!\/usr\/bin\/python3\nimport json\nimport csv\n\nwith open('answers.json') as f:\n    data = json.load(f)\n\ncsv_file = open('answers.csv', 'w')\n\ncsv_writer = csv.writer(csv_file)\n\ncount = 0\n\nfor item in data:\n    if count == 0:\n        header = item.keys()\n        csv_writer.writerow(header)\n        count += 1\n\n    csv_writer.writerow(item.values())\n\ncsv_file.close()\n<\/code><\/pre>\n\n\n\n<p>The first line in the script is just to tell Linux to use python3. If you have Windows, that isn&#8217;t necessary.<\/p>\n\n\n\n<p>My input file was answers.json, and output was answers.csv . You would change these to suit, or if you have a lot of files, maybe want to prompt for the names.<\/p>\n\n\n\n<p>The program assumes that the first record of the json is a header with column names.<\/p>\n\n\n\n<p>Unlike the previous conversion website that I used, multi-byte unicode characters like emojis were handled well, for instance:<\/p>\n\n\n\n<p>Q: \ud83d\ude02\u201e are you a rePUPlican \ud83d\udc36 or a demoCAT \ud83d\udc31?<\/p>\n\n\n\n<p>A: indePANDAnt \ud83d\udc3c<\/p>\n\n\n\n<p>The output file is comma-separated values (csv), which can easily be read by Excel, or Openoffice Calc.<\/p>\n\n\n\n<p>One problem was that my Yahoo Answers database originally came from the web (html), so contained strings like<\/p>\n\n\n\n<p><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">I can&amp;#39;t believe it &amp;lt;3<\/mark><\/code><\/p>\n\n\n\n<p>which should be<\/p>\n\n\n\n<p><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">I can't believe it &lt;3<\/mark><\/code><\/p>\n\n\n\n<p>As the ampersand-hash sequence was just ordinary characters in the JSON, it passed right through that way, and I had to do a global search and replace on all such things. I just did a search on <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">&amp;<\/mark><\/code> in all cells, then when I found a pattern like <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">can&amp;#39;t<\/mark><\/code> did a global replace of <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">&amp;#39;<\/mark><\/code> with <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">'<\/mark><\/code> (apostrophe).  Not all of the finds were of the form <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">&amp;# <\/mark><\/code>&#8211; there were a few like <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">&amp;quot;<\/mark><\/code> and <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-red-color\">&amp;amp;<\/mark><\/code> .<\/p>\n\n\n\n<p>There&#8217;s probably some Excel script that does the translation of such things, but it didn&#8217;t take long to do it manually.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This python script is a better solution than my previous post. It was generated with a prompt to Microsoft&#8217;s version of chatGPT, something like &#8220;Write a python program to translate a unicode JSON file of arbitrary size into comma-separated values.&#8221; I did a little manual fix-up on the script that was generated, and got: The &hellip; <a href=\"http:\/\/pididu.com\/wordpress\/blog\/convert-json-database-file-to-excel\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Convert JSON Database File to Excel<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[166],"tags":[360,390,391,358,361,359],"_links":{"self":[{"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/posts\/3280"}],"collection":[{"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/comments?post=3280"}],"version-history":[{"count":0,"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/posts\/3280\/revisions"}],"wp:attachment":[{"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/media?parent=3280"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/categories?post=3280"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/pididu.com\/wordpress\/wp-json\/wp\/v2\/tags?post=3280"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}