Semaphores for better distribution of work items across threads

 

As developers, we often find ourselves writing loops to process hundreds, thousands, or even millions of items, e.g. parsing files, screen scraping sites, or doing complex computations on multiple rows.

To process these types of jobs, we can take a cue from supermarket checkout lines: there is one long line and several counters, each serving one customer at a time. When a cashier is done with the current customer, the next customer is called from the line.

In the programming world we have what we call threading and semaphores, and luckily .NET offers built-in classes for these concepts :)

Threads allow us to do calculations on a separate thread while leaving the original/main thread free to handle the GUI.

https://en.wikipedia.org/wiki/Multithreading_(computer_architecture)
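
To make the idea concrete, here is a minimal sketch of handing a calculation to a separate thread. The class name ThreadSketch and the SumTo helper are made up for illustration and only stand in for any long-running computation.

```csharp
using System;
using System.Threading;

class ThreadSketch
{
    // Hypothetical stand-in for a long-running calculation
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }

    static void Main()
    {
        long result = 0;

        // Run the calculation on a separate thread
        Thread worker = new Thread(() => result = SumTo(1000000));
        worker.Start();

        // The main thread is not blocked here and can keep doing other things
        Console.WriteLine("Main thread is still free while the worker runs.");

        // Wait for the worker to finish before reading its result
        worker.Join();
        Console.WriteLine("Sum = " + result);
    }
}
```

In a GUI application the main thread would go back to handling user input instead of calling Join right away; the Join here just keeps the console sample simple.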

Semaphores allow us to process only a limited number of items from our list (the collection of jobs that need to be processed) at the same time: each waiting job is signalled to start as soon as a slot becomes available.

https://en.wikipedia.org/wiki/Semaphore_(programming)
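
As a rough sketch of the supermarket idea, the snippet below runs ten simulated jobs but lets at most three of them work at the same time. The names SemaphoreSketch, RunJobs, and counters are assumptions made for this example, and Thread.Sleep only simulates real work.

```csharp
using System;
using System.Threading;

class SemaphoreSketch
{
    // Runs jobCount simulated jobs with at most maxConcurrent running at once
    // and returns the highest number of jobs observed running together.
    public static int RunJobs(int jobCount, int maxConcurrent)
    {
        int running = 0;
        int peak = 0;
        object gate = new object();

        // All slots are free from the start
        using (Semaphore counters = new Semaphore(maxConcurrent, maxConcurrent))
        {
            Thread[] threads = new Thread[jobCount];
            for (int i = 0; i < jobCount; i++)
            {
                threads[i] = new Thread(() =>
                {
                    // Wait in line until a counter is free
                    counters.WaitOne();
                    try
                    {
                        lock (gate)
                        {
                            running++;
                            if (running > peak) peak = running;
                        }

                        // Simulated work
                        Thread.Sleep(50);

                        lock (gate) { running--; }
                    }
                    finally
                    {
                        // Free the counter for the next job in line
                        counters.Release();
                    }
                });
                threads[i].Start();
            }

            // Wait until every job has finished
            foreach (Thread t in threads) t.Join();
        }

        return peak;
    }

    static void Main()
    {
        Console.WriteLine("Peak concurrency: " + RunJobs(10, 3));
    }
}
```

The reported peak never exceeds the second argument, because a thread can only pass WaitOne while a slot is free.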

 

This is the console output produced by the code further down:

[image: console output]

 

Text files will be generated in the bin directory:

[image: generated text files]

 

Below is a short program that connects to multiple sites and saves the HTML that each one returns.

using System;
using System.Net;
using System.Threading;

namespace Semaphores
{
    class Program
    {
        private static Semaphore sem;
        private static int workCount;
        private static int totalJobs;
        private static int maxJobs = 3;
        public static void Main()
        {
            // The sites to download, one job per entry
            string[] webSites = new string[11];

            // Start with zero free slots so every worker has to wait at first
            sem = new Semaphore(0, maxJobs);

            webSites[0] = "https://blog.codinghorror.com";
            webSites[1] = "http://www.asp.net";
            webSites[2] = "http://odetocode.com";
            webSites[3] = "http://irisclasson.com";
            webSites[4] = "http://blog.davidebbo.com";
            webSites[5] = "https://damianedwards.wordpress.com";
            webSites[6] = "http://davidfowl.com";
            webSites[7] = "http://weblogs.asp.net/jongalloway";
            webSites[8] = "https://damieng.com";
            webSites[9] = "http://weblogs.asp.net/scottgu";
            webSites[10] = "http://haacked.com";
            totalJobs = 0;  

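            // Start one thread per job; each thread queues up on the semaphore
            for (int i = 0; i <= 10; i++)
            {
                Thread thread = new Thread(new ParameterizedThreadStart(Worker));
                thread.Start(webSites[i]);
            }

            // Give the threads a moment to start and report that they are waiting
            Thread.Sleep(500);

            // Open the semaphore so that the first maxJobs threads can start working
            sem.Release(maxJobs);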
           
            Console.ReadLine();
        }

        private static void Worker(object site)
        {
            Console.WriteLine("{0} added to queue and waits for a signal from the semaphore.", site);

            // Wait for a go signal from the semaphore
            sem.WaitOne();
            try
            {
                // Start to work
                webWorker(site.ToString());
            }
            finally
            {
                // Always hand the slot back so the next waiting thread can run
                sem.Release();
            }
        }

        private static void webWorker(string site)
        {
            try
            {
                Console.WriteLine("Started to connect to " + site);

                // Connect to the website and get the HTML contents
                string responseData;
                using (WebClient wc = new WebClient())
                {
                    responseData = wc.DownloadString(site);
                }

                // Clean up the URL to generate a file name for the output file
                string outFile = site.Replace("http://", "");
                outFile = outFile.Replace("https://", "");
                outFile = outFile.Replace("/", "");

                // Save the HTML content; disposing the writer flushes it to disk
                using (System.IO.StreamWriter sw = new System.IO.StreamWriter(outFile + ".txt", true))
                {
                    sw.WriteLine(responseData);
                }

                // Count successful downloads; Interlocked keeps the shared counter consistent
                Interlocked.Increment(ref workCount);
            }
            catch (Exception ex)
            {
                // An exception escaping a thread would terminate the process, so report it here
                Console.WriteLine("Failed to process " + site + ": " + ex.Message);
            }
            finally
            {
                // Count the job as processed whether it succeeded or failed
                countItems(site);
            }
        }

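        private static void countItems(string site)
        {
            // Interlocked makes the increment safe when several workers finish at once
            int processed = Interlocked.Increment(ref totalJobs);

            Console.WriteLine("Last Job Processed ==>  " + site);
            Console.WriteLine("Total Jobs Processed ==>  " + processed);

            // Every job ends up here, so this fires once all 11 sites have been handled
            if (processed >= 11)
            {
                Console.WriteLine("Done with all the jobs . . . . ");
            }
        }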

    }
}
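
The program creates its semaphore with an initial count of 0 and only opens it later with Release(maxJobs). The small sketch below illustrates how those two numbers behave; the class and method names are made up for this example, and WaitOne(0) is used only to test for a free slot without blocking.

```csharp
using System;
using System.Threading;

class SemaphoreCountSketch
{
    // Creates a semaphore with no free slots, releases `released` slots,
    // and returns how many slots can then be taken without waiting.
    public static int SlotsAvailableAfterRelease(int released, int maximum)
    {
        using (Semaphore sem = new Semaphore(0, maximum))
        {
            // Initial count 0: nothing can enter yet
            if (sem.WaitOne(0))
            {
                throw new InvalidOperationException("Expected no free slots.");
            }

            // Open the requested number of slots
            sem.Release(released);

            // Take slots until none are left
            int taken = 0;
            while (sem.WaitOne(0))
            {
                taken++;
            }
            return taken;
        }
    }

    static void Main()
    {
        // Prints 3: exactly the released slots can be taken
        Console.WriteLine(SlotsAvailableAfterRelease(3, 3));
    }
}
```

Passing maxJobs as the initial count instead of 0 would let the first workers start immediately, without the separate Release call.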

 

Please feel free to comment and suggest . . .

posted on Tuesday, September 13, 2016 7:20 PM
Comments
# re: Semaphores for better distribution of work items across threads
LindaBPolk
9/19/2016 1:55 PM
Very interesting topic, I really liked it. Semaphores allow us to process a limited number of items within our list by flagging them when a thread is already available. There are a lot of benefits in multi programming.
http://american-writers.org/
# re: Semaphores for better distribution of work items across threads
justanotherdevguy
10/24/2016 1:38 AM
I think you can use Parallel for this scenario.
