Getting the HTML of a Page

You can easily get the HTML of a page by using the WebClient class, which lives in the System.Net namespace. The code below downloads the HTML source of yahoo.com and saves it to a text file.

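using System.IO;
using System.Net;
using System.Text;

private void Button1_Click(object sender, System.EventArgs e)
{
    const string strUrl = "http://www.yahoo.com/";

    // Download the raw bytes of the page
    WebClient webClient = new WebClient();
    byte[] reqHTML = webClient.DownloadData(strUrl);

    // Decode the bytes as text (this assumes the page is UTF-8 encoded)
    UTF8Encoding objUTF8 = new UTF8Encoding();
    string pageHTML = objUTF8.GetString(reqHTML);

    // You can also write the whole HTML to a file
    string path = @"C:\TEMP\testFile.txt";
    using (FileStream fs = File.Create(path))
    using (StreamWriter sw = new StreamWriter(fs))
    {
        // Disposing the writer flushes its buffer before the file is closed;
        // closing only the FileStream could leave the file empty or truncated
        sw.Write(pageHTML);
    }
}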
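If you are on .NET 2.0 or later, the same thing can be done with fewer steps. The following is only a sketch under that assumption: the class name PageDownloader and the helper methods GetHtml and SaveHtml are made up for this example, and it still assumes the page is UTF-8 encoded.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class PageDownloader
{
    // Downloads the page at the given URL and returns its HTML as a string.
    // Assumption: the page is UTF-8 encoded.
    public static string GetHtml(string url)
    {
        using (WebClient client = new WebClient())
        {
            client.Encoding = Encoding.UTF8;
            return client.DownloadString(url);
        }
    }

    // Writes the HTML to the given path, creating or overwriting the file.
    public static void SaveHtml(string html, string path)
    {
        File.WriteAllText(path, html, Encoding.UTF8);
    }

    static void Main()
    {
        string html = GetHtml("http://www.yahoo.com/");
        SaveHtml(html, @"C:\TEMP\testFile.txt");
        Console.WriteLine("Saved {0} characters.", html.Length);
    }
}
```

DownloadString takes care of the byte-to-string decoding, and File.WriteAllText opens, writes and closes the file in one call, so there is no stream to forget to flush. If you only need the file on disk and not the string, WebClient.DownloadFile(url, path) is another option.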

 


Posted Friday, September 16, 2005 8:54 PM
