php - HTML DOM parse 5000+ items
Is there any way to run/write the code below more efficiently?
When I run it (via the Chrome browser), it tends to time out around the 500th item and redirects me to the homepage.
<?php
include_once('config.php');
include_once('simple_html_dom.php');

for ($i = 0; $i <= 5000; ++$i) {
    // retrieve the DOM for the given URL
    // ($url is assumed to be set for item $i; the question does not show how)
    $html = file_get_html($url);

    // loop through the page contents and retrieve the required fields
    foreach ($html->find('div.product-details-contents') as $content) {
        $detail['productid'] = $i;
        $detail['title'] = $content->find('span.title', 0)->plaintext;
        $detail['unit'] = $content->find('span.unit-size', 0)->plaintext;

        // note: the values are placed into the query unescaped
        $sqlstring = implode("','", $detail);
        $sql = "INSERT INTO `cdidlist` (`productid`, `title`, `unit`) VALUES ('$sqlstring')";

        if (!mysqli_query($connect, $sql)) {
            echo "error: " . mysqli_error($connect);
        }
        echo $i . " " . $detail['title'] . " item added successfully! <br>";
    }
}
?>
You are opening 5000 web pages and parsing them. That can't be done efficiently. To keep the script from dying, you can call set_time_limit(600) inside of the loop (each call restarts the timeout counter from zero), making sure you have an appropriately high timeout in php.ini as well.
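A minimal sketch of the set_time_limit() idea, assuming a made-up processItem() helper that stands in for the fetch/parse/insert work from the question, and a loop of 6 items instead of 5001 so it finishes instantly:

```php
<?php
// Hypothetical stand-in for "fetch one page, parse it, insert one row".
function processItem(int $i): string
{
    return "item $i";
}

$processed = [];
for ($i = 0; $i <= 5; ++$i) { // 5 instead of 5000 to keep the demo short
    // Restart the execution timer on every pass, so the 600 seconds
    // apply to each item rather than to the whole run.
    set_time_limit(600);
    $processed[] = processItem($i);
}

echo count($processed) . " items processed\n";
```

This only covers PHP's own execution limit. If the web server or the browser gives up on the request first, raising the limit will not help.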
Edit: You don't own the server. That means you are going to have to push the loop off to the client side. I would go with this:
PHP:
if (isset($_REQUEST['i'])) {
    $i = (int) $_REQUEST['i']; // sanitize input
    $error_message = false;

    /* load the page, parse it and insert into the db.
       if there is an error, save it in $error_message */

    if (!$error_message) {
        die(json_encode('ok')); // die'ing is bad, but this is a one-off script
    } else {
        die(json_encode($error_message));
    }
}
In the HTML:
<p id="status">status</p>
<script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script type="text/javascript">
$(function () {
    'use strict';
    // request item i, then move on to i + 1 once the server answers
    var get = function (i) {
        if (i > 5000) {
            $('#status').html('complete');
        } else {
            // $.ajax is used because $.get in jQuery 1.9.1 does not take a settings object
            $.ajax({
                url: window.location.href,
                data: {i: i},
                dataType: 'json', // the PHP side replies with json_encode(), so parse it as JSON
                success: function (data) {
                    if (data === 'ok') {
                        $('#status').html('fetched ' + i);
                        get(i + 1);
                    } else {
                        $('#status').html('error fetching ' + i + ': ' + data);
                    }
                }
            });
        }
    };
    get(0);
});
</script>
Edit 2: As others have mentioned, this is vulnerable to SQL injection. Please see PDO and PDOStatement for prepared statements.
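A rough sketch of what the insert could look like with a prepared statement. The table and column names come from the question; everything else is an assumption for illustration: an in-memory SQLite database stands in for the real MySQL server (this needs the pdo_sqlite extension), and the sample rows are made up.

```php
<?php
// Assumption: in-memory SQLite stands in for the real database.
// For MySQL only the DSN and credentials passed to PDO would differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE cdidlist (productid INTEGER, title TEXT, unit TEXT)');

// Prepare the statement once, then execute it for every item.
$insert = $pdo->prepare(
    'INSERT INTO cdidlist (productid, title, unit) VALUES (:productid, :title, :unit)'
);

// Made-up sample data; the first title contains a quote, which would
// break the query if it were built by string concatenation.
$details = [
    ['productid' => 1, 'title' => "Baker's Flour", 'unit' => '1kg'],
    ['productid' => 2, 'title' => 'Olive Oil', 'unit' => '500ml'],
];

foreach ($details as $detail) {
    // Values are bound as parameters, never spliced into the SQL text.
    $insert->execute([
        ':productid' => $detail['productid'],
        ':title'     => $detail['title'],
        ':unit'      => $detail['unit'],
    ]);
}

$count = (int) $pdo->query('SELECT COUNT(*) FROM cdidlist')->fetchColumn();
$title = $pdo->query('SELECT title FROM cdidlist WHERE productid = 1')->fetchColumn();
echo $count . " rows inserted\n";
```

Because the scraped title and unit are passed as bound parameters, a stray quote in a product name is stored as data instead of changing the query.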