C# - Intermittent aborted connections when web scraping using WebClient


Sometimes I get an error, and sometimes I get the data from the site.

Error:

    Wyjątek nieobsłużony: System.Net.WebException: Żądanie zostało przerwane: Połączenie zostało nieoczekiwanie zakończone.
       w System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
       w System.IO.StreamReader.ReadBuffer()
       w System.IO.StreamReader.ReadLine()
       w ConsoleApplication3.Download.GetUrlData() w c:\users\user\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\program.cs:wiersz 41
       w ConsoleApplication3.Program.Main(String[] args) w c:\users\user\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\program.cs:wiersz 55

English translation: "Unhandled exception: System.Net.WebException: The request was aborted: the connection was closed unexpectedly."

My code:

    public string GetUrlData()
    {
        WebClient client = new WebClient();
        Random r = new Random();
        // random IP address
        client.Headers["X-Forwarded-For"] = r.Next(0, 255) + "." + r.Next(0, 255) + "." + r.Next(0, 255) + "." + r.Next(0, 255);
        // random user-agent
        client.Headers["User-Agent"] = "Mozilla/" + r.Next(3, 5) + ".0 (Windows NT " + r.Next(3, 5) + "." + r.Next(0, 2) + "; rv:2.0.1) Gecko/20100101 Firefox/" + r.Next(3, 5) + "." + r.Next(0, 5) + "." + r.Next(0, 5);
        Stream dataStream = client.OpenRead(url);
        StreamReader reader = new StreamReader(dataStream);
        StringBuilder sb = new StringBuilder();
        while (!reader.EndOfStream)
            sb.Append(reader.ReadLine());
        return sb.ToString();
    }

And Main:

    var d = new Download("http://wiocha.pl");
    var str = d.GetUrlData();
    Console.WriteLine(str);

What can I do to get the data every time without an error?

Limit on max concurrent clients

I know of a limitation with HttpWebRequest: you can have only 1 or 2 active client requests at a time. I'm not familiar with the WebClient way, but you must actively dispose of the HttpWebResponse object when using the HttpWebRequest way. Therefore, I suggest you first try disposing of dataStream before returning the string.
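As a rough sketch of that suggestion (not a confirmed fix for the error above), the stream, the reader and the WebClient can each be wrapped in a using statement so they are disposed even when reading throws. The names DownloadSketch and ReadAndDispose are made up for illustration, and the random headers from the question are left out to keep it short.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class; the name is an assumption for this sketch.
public static class DownloadSketch
{
    // Reads the whole stream as text, then disposes both the reader and the stream.
    public static string ReadAndDispose(Stream dataStream)
    {
        using (dataStream)
        using (var reader = new StreamReader(dataStream))
        {
            return reader.ReadToEnd();
        }
    }

    // Opens the URL and releases the connection as soon as the body has been read.
    public static string GetUrlData(string url)
    {
        using (var client = new WebClient())
        {
            return ReadAndDispose(client.OpenRead(url));
        }
    }
}
```

With this shape, the Main from the question would call DownloadSketch.GetUrlData("http://wiocha.pl") instead of creating a Download object.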

Not a programming problem

It looks like you are web scraping, and the site is cutting you off. You need to:

  1. limit how often you send requests, and;
  2. vary your source IP by using other machines/networks with different IPs and/or by using VPNs.
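The first point could be sketched roughly as below: a small throttle that enforces a minimum gap between requests. RequestThrottle and its members are hypothetical names, and the right interval is an assumption that depends on what the site tolerates.

```csharp
using System;
using System.Threading;

// Hypothetical throttle; enforces a minimum interval between two requests.
public sealed class RequestThrottle
{
    private readonly TimeSpan _minInterval;
    private DateTime _lastRequestUtc = DateTime.MinValue;

    public RequestThrottle(TimeSpan minInterval)
    {
        _minInterval = minInterval;
    }

    // Returns how long the caller still has to wait at the given time.
    public TimeSpan DelayNeeded(DateTime nowUtc)
    {
        TimeSpan elapsed = nowUtc - _lastRequestUtc;
        return elapsed >= _minInterval ? TimeSpan.Zero : _minInterval - elapsed;
    }

    // Records that a request was sent at the given time.
    public void MarkRequest(DateTime nowUtc)
    {
        _lastRequestUtc = nowUtc;
    }

    // Blocks until the minimum interval has passed, then records the request.
    public void Wait()
    {
        TimeSpan delay = DelayNeeded(DateTime.UtcNow);
        if (delay > TimeSpan.Zero)
            Thread.Sleep(delay);
        MarkRequest(DateTime.UtcNow);
    }
}
```

Calling Wait() on a shared instance, for example new RequestThrottle(TimeSpan.FromSeconds(2)), right before each client.OpenRead(url) spaces the requests out. It does nothing about the source IP, which is the second point.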

Another improvement

Also, there's an easier way to read the input stream into a single string: var responseString = reader.ReadToEnd();. You wouldn't need the StringBuilder or the while loop.

