{"id":302,"date":"2020-06-15T17:01:10","date_gmt":"2020-06-15T08:01:10","guid":{"rendered":"https:\/\/blog.wsd.sh\/?p=302"},"modified":"2020-06-16T16:40:44","modified_gmt":"2020-06-16T07:40:44","slug":"nodejs-book-chapter-10","status":"publish","type":"post","link":"https:\/\/blog.wsd.sh\/?p=302","title":{"rendered":"Nodejs Book: Chapter 10"},"content":{"rendered":"<p>In the last chapter, we created a form parser. But it&#8217;s still incomplete as being<br \/>\nable to upload files is one of the advantages of using forms. So in this chapter<br \/>\nwe will break down how a file is different from normal form data, and then devise<br \/>\nan approach to handle that information.<\/p>\n<p>Using the raw form text program we edit our chapter 8 example to include a file<br \/>\ntype and upload a very small image to see how information is encoded for a normal<br \/>\nform field when compared to a file.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_12.jpg\" alt=\"\" class=\"alignnone size-full wp-image-303\" width=\"1140\" height=\"886\" srcset=\"https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_12.jpg 1140w, https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_12-300x233.jpg 300w, https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_12-1024x796.jpg 1024w, https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_12-768x597.jpg 768w\" sizes=\"(max-width: 1140px) 100vw, 1140px\" \/><\/p>\n<p>The output should look like the above image. A few differences are easily<br \/>\nobservable. For a normal field type we have &#8220;Content-Disposition: form-data;&#8221;<br \/>\nprepended onto the front, followed by the name of the field, and then two line<br \/>\nbreaks before the value of the given field.<\/p>\n<p>In the case of a file, the prefix is the same with Content-Disposition being<br \/>\ndeclared followed by the name of the field. From there we see something quite<br \/>\ndifferent. 
The name is followed by a filename that wasn&#8217;t present before, and<br \/>\nwe have the MIME type &#8220;Content-Type: image\/png&#8221; sent from the client.<\/p>\n<p>Following that we have two line breaks before some unusual characters. Those<br \/>\ncharacters are the utf8 string representation of the binary image we sent<br \/>\nto the server, in this case the data of a single-pixel image.<\/p>\n<p>Given the structure we have an unusual situation where we have to work with the<br \/>\nsame information as both text and binary data. We have to determine the form data<br \/>\nboundary to split the form data into different parts. From there we have to parse<br \/>\nthe text from the header to retrieve information for the given fields. And<br \/>\nfinally, following the two line breaks, we have to get the binary data from that offset<br \/>\nuntil the next boundary.<\/p>\n<p>We will simply edit our &#8220;api_form&#8221; function from the previous chapter to create<br \/>\na more generalized form parser which is able to handle both standard form fields and<br \/>\nform files. 
The source code is as given below.<\/p>\n<pre>function api_form(req, res) {\n\n\tvar raw_data = [];\n\tvar raw_length = 0;\n\n\treq.on(\"data\", function(data) {\n\n\t\traw_data.push(data);\n\t\traw_length += data.length;\n\n\t});\n\n\treq.on(\"end\", function() {\n\n\t\tvar boundary, i, buf, file, line;\n\t\tvar buffer = Buffer.concat(raw_data, raw_length);\n\t\tvar ofs = [];\n\n\t\tlet form = {};\n\t\tlet files = [];\n\n\t\tfor (i = 0; i &lt; buffer.length; i++) {\n\n\t\t\tif (buffer[i] !== 10) {\n\t\t\t\tcontinue;\n\t\t\t}\n\n\t\t\tboundary = buffer.toString(\"ascii\", 0, i - 1);\n\t\t\tconsole.log(boundary);\n\t\t\tbreak;\n\n\t\t}\n\n\t\ti = 0;\n\t\twhile (i !== -1) {\n\t\t\tofs.push(i);\n\t\t\ti = buffer.indexOf(boundary, i + 1, \"ascii\");\n\t\t}\n\n\t\tfor (i = 0; i &lt; ofs.length - 1; i++) {\n\n\t\t\tbuf = buffer.slice(ofs[i], ofs[i + 1]);\n\t\t\tboundary = buf.indexOf(\"\\r\\n\\r\\n\", 0, \"ascii\");\n\n\t\t\tlet header = buf.slice(0, boundary).toString(\"ascii\");\n\t\t\tlet key = header.match(\/name=\"(.*?)\"\/)[1];\n\t\t\tlet filename = header.match(\/filename=\"(.*?)\"\/);\n\n\t\t\tfile = buf.slice(boundary + 4);\n\n\t\t\tif (!filename) {\n\n\t\t\t\tform[key] = file.toString(\"utf8\");\n\n\t\t\t} else {\n\n\t\t\t\tfiles.push({\n\t\t\t\t\t\"name\": filename[1],\n\t\t\t\t\t\"key\": key,\n\t\t\t\t\t\"data\": file\n\t\t\t\t});\n\n\t\t\t}\n\n\t\t}\n\n\t\tvar response = [];\n\n\t\tasync.eachSeries(files, function(file, nextFile) {\n\n\t\t\tresponse.push(\"\/img\/\" + file.name);\n\t\t\tfs.writeFile(\"public\/img\/\" + file.name, file.data, function(err) {\n\t\t\t\tif (err) {\n\t\t\t\t\tthrow err;\n\t\t\t\t}\n\n\t\t\t\tnextFile();\n\t\t\t});\n\n\t\t}, function() {\n\n\t\t\tres.writeHead(200, {\n\t\t\t\t\"Content-Type\": \"text\/plain\"\n\t\t\t});\n\t\t\tres.end(JSON.stringify(response));\n\n\t\t});\n\n\t});\n}\n<\/pre>\n<p>The code is not the most elegant approach, but it works, so let&#8217;s go over it.<\/p>\n<pre>function api_form(req, res) {\n\n\tvar 
raw_data = [];\n\tvar raw_length = 0;\n\n\treq.on(\"data\", function(data) {\n\n\t\traw_data.push(data);\n\t\traw_length += data.length;\n\n\t});\n<\/pre>\n<p>As the data arrives in chunks, each one a buffer, we create an array where we store each buffer<br \/>\nas it arrives, and keep a running total of the length of the raw data.<\/p>\n<pre>req.on(\"end\", function() {\n\n    var boundary, i, buf, file, line;\n    var buffer = Buffer.concat(raw_data, raw_length);\n    var ofs = [];\n\n    let form = {};\n    let files = [];\n\n    for (i = 0; i &lt; buffer.length; i++) {\n\n        if (buffer[i] !== 10) {\n            continue;\n        }\n\n        boundary = buffer.toString(\"ascii\", 0, i - 1);\n        console.log(boundary);\n        break;\n\n    }\n<\/pre>\n<p>Once the client has completed sending the information to the server, we concatenate<br \/>\nall of our partial buffers into one complete buffer. The next step is to find<br \/>\nthe first line break, as everything up to that point is the<br \/>\nform boundary string. We store this string in a variable so we can separate out<br \/>\nthe boundaries.<\/p>\n<pre>i = 0;\nwhile (i !== -1) {\n    ofs.push(i);\n    i = buffer.indexOf(boundary, i + 1, \"ascii\");\n}\n<\/pre>\n<p>From there we scan the entire buffer. We start from 0, and search for the<br \/>\nboundary string. 
This gives us a list of offsets, where each section starts with the boundary at<br \/>\nthe top and its data ends at the next offset.<\/p>\n<pre>for (i = 0; i &lt; ofs.length - 1; i++) {\n\n    buf = buffer.slice(ofs[i], ofs[i + 1]);\n    boundary = buf.indexOf(\"\\r\\n\\r\\n\", 0, \"ascii\");\n\n    let header = buf.slice(0, boundary).toString(\"ascii\");\n    let key = header.match(\/name=\"(.*?)\"\/)[1];\n    let filename = header.match(\/filename=\"(.*?)\"\/);\n\n    file = buf.slice(boundary + 4);\n\n    if (!filename) {\n\n        form[key] = file.toString(\"utf8\");\n\n    } else {\n\n        files.push({\n            \"name\": filename[1],\n            \"key\": key,\n            \"data\": file\n        });\n\n    }\n\n}\n<\/pre>\n<p>It&#8217;s from there that we have to sort out several conditions. First we slice the<br \/>\ntotal raw buffer to focus on a single field segment. From there we can assume<br \/>\nthat two line breaks define the break between the field header and the field<br \/>\ndata. We locate the index where the header ends and convert the header to a string to extract<br \/>\nthat information.<\/p>\n<p>From there we use a regular expression to extract the field name and attempt to<br \/>\nextract a filename, if it exists. The rest of the segment we can assume is the data<br \/>\nassociated with the given field from the client. If no filename exists, then we<br \/>\ncan assume it&#8217;s a standard form input and we store its value in the form<br \/>\nobject, using the field name as the key. 
Otherwise, if there is a filename, we store the filename, the field<br \/>\nname from the form, and the binary data.<\/p>\n<pre>var response = [];\n\nasync.eachSeries(files, function(file, nextFile) {\n\n    response.push(\"\/img\/\" + file.name);\n    fs.writeFile(\"public\/img\/\" + file.name, file.data, function(err) {\n        if (err) {\n            throw err;\n        }\n\n        nextFile();\n    });\n\n}, function() {\n\n    res.writeHead(200, {\n        \"Content-Type\": \"text\/plain\"\n    });\n    res.end(JSON.stringify(response));\n\n});\n<\/pre>\n<p>From there we iterate over the files and write them to the &#8220;img&#8221; directory<br \/>\nin our public folder for debugging, and add each of the file URLs to a list which<br \/>\nwe return to the client.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_13.jpg\" alt=\"\" class=\"alignnone size-full wp-image-304\" width=\"1140\" height=\"886\" srcset=\"https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_13.jpg 1140w, https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_13-300x233.jpg 300w, https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_13-1024x796.jpg 1024w, https:\/\/blog.wsd.sh\/wp-content\/uploads\/2020\/06\/fig_13-768x597.jpg 768w\" sizes=\"(max-width: 1140px) 100vw, 1140px\" \/><\/p>\n<p>The HTML file for this chapter is given below.<\/p>\n<p>File: form.html<\/p>\n<pre>&lt;!DOCTYPE HTML&gt;\n&lt;html&gt;\n\n    &lt;head&gt;\n\n        &lt;meta charset=\"utf-8\"\/&gt;\n        &lt;title&gt;Ajax Request&lt;\/title&gt;\n\n    &lt;\/head&gt;\n\n    &lt;body&gt;\n\n        &lt;form id=\"exampleForm\"&gt;\n            &lt;table&gt;\n                &lt;tr&gt;\n                    &lt;td&gt;Profile Image:&lt;\/td&gt;\n                    &lt;td&gt;&lt;input type=\"file\" name=\"profile_img\" multiple\/&gt;&lt;\/td&gt;\n                &lt;\/tr&gt;\n                &lt;tr&gt;\n                   
 &lt;td&gt;Submit:&lt;\/td&gt;\n                    &lt;td&gt;&lt;input type=\"submit\" value=\"Submit\"\/&gt;&lt;\/td&gt;\n                &lt;\/tr&gt;\n            &lt;\/table&gt;\n        &lt;\/form&gt;\n\n        &lt;br&gt;\n\n        &lt;pre id=\"responseText\"&gt;&lt;\/pre&gt;\n\n        &lt;script type=\"text\/javascript\" src=\"js\/form.js\"&gt;&lt;\/script&gt;\n\n    &lt;\/body&gt;\n\n&lt;\/html&gt;\n<\/pre>\n<p>File: js\/form.js<\/p>\n<pre>\"use strict\";\n\nvar exampleForm = document.getElementById(\"exampleForm\");\nvar responseText = document.getElementById(\"responseText\");\n\nexampleForm.addEventListener(\"submit\", function (event) {\n\n    event.preventDefault();\n\n    var formData = new FormData(exampleForm);\n\n    var xml = new XMLHttpRequest();\n    xml.open(\"POST\", \"\/api\/form\", true);\n    xml.send(formData);\n\n    xml.onload = function() {\n\n        responseText.textContent = xml.responseText;\n\n    }\n});\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>In this article we go through the process of parsing the form data that we&#8217;ve uploaded to the server. 
When it comes to parsing, we need to find the text pattern used for separating boundaries and read everything in between to extract the files.<\/p>\n","protected":false},"author":1,"featured_media":249,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false},"categories":[4],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/posts\/302"}],"collection":[{"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=302"}],"version-history":[{"count":3,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/posts\/302\/revisions"}],"predecessor-version":[{"id":307,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/posts\/302\/revisions\/307"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=\/wp\/v2\/media\/249"}],"wp:attachment":[{"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=302"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=302"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.wsd.sh\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=302"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}