It’s funny that two posts so close together speak about flexibility with the LINQ to Twitter provider. There are certain things you know from experience on when to make software more flexible and when to save time. This is another one of those times when I got lucky and made the right choice up front. I’m talking about the ability to switch URLs.
It only makes sense that Twitter should begin versioning their API as it matures. In fact, most of the API has moved to the v1 URL at “https://api.twitter.com/1/”, except for search and trends. Recently, Twitter introduced the available and local trends, but hung them off the new v1 URL and left the rest of the trends API on the old URL. To implement this, I muscled my way into the expression tree during CreateRequestProcessor to figure out which trend I was dealing with; perhaps not elegant, but the code is in the right place and that’s what factories are for. Anyway, the point is that I wouldn’t have to do this kind of stuff (as much fun as it is) if Twitter were more consistent. Having gone to Chirp last week and seen the evolution of the API, it looks like my wish is coming true. …now if they would just get their stuff together on the mess they made with geo-location and places… but again, that’s all transparent if you’re using LINQ to Twitter because I pulled all of that together in a consistent way so that you don’t have to.
Normally, when Twitter makes a change, code breaks and I have to scramble to get the fixes in place. This time, in the case of a URL change, the adjustment is easy and no one has to wait for me. Essentially, all you need to do is change the URL passed to the TwitterContext constructor. Here’s an example of instantiating a TwitterContext now:
using (var twitterCtx = new TwitterContext(auth, "https://api.twitter.com/1/", "https://search.twitter.com/"))
The third constructor parameter is the SearchUrl, which is used for the Search and Trend APIs. You probably know what’s coming next: the same constructor call, but with the SearchUrl parameter set to the new URL, as follows:
using (var twitterCtx = new TwitterContext(auth, "https://api.twitter.com/1/", "https://api.twitter.com/1/"))
One consequence of setting the URL this way is that you set the URL for both Trends and Search. Since Search is still using the old URL, this is going to break for Search queries. You could always instantiate a special TwitterContext instance for Search queries, with the old URL set. Alternatively, you can use the TwitterContext’s SearchUrl property. Here’s an example:
twitterCtx.SearchUrl = "https://api.twitter.com/1/";
var trends =
    (from trend in twitterCtx.Trends
     where trend.Type == TrendType.Daily &&
           trend.Date == DateTime.Now.AddDays(-2).Date
     select trend)
    .ToList();
Notice how I set the SearchUrl property just-in-time for the query. This allows you to target the URL for each specific query. Whichever way you prefer to configure the URL, it’s your choice.
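One thing to keep in mind with the just-in-time approach is that the SearchUrl you set stays in effect on that TwitterContext until you change it again, so a Search query run later on the same instance would go to the new URL. The following is only a sketch of one way to scope the change; it is not part of LINQ to Twitter. FakeContext is a hypothetical stand-in for TwitterContext, TemporaryValue is a made-up helper, and the sketch assumes SearchUrl can be read as well as written.

```csharp
using System;

// Hypothetical stand-in for TwitterContext; only the property that matters here.
public class FakeContext
{
    public string SearchUrl { get; set; }
}

// Made-up helper: applies a temporary value and restores the original on Dispose.
public sealed class TemporaryValue : IDisposable
{
    private readonly Action<string> _setter;
    private readonly string _original;

    public TemporaryValue(Func<string> getter, Action<string> setter, string temporary)
    {
        _setter = setter;
        _original = getter();   // remember the current value
        setter(temporary);      // switch to the temporary one
    }

    public void Dispose()
    {
        _setter(_original);     // put the original value back
    }
}

public static class TemporaryUrlDemo
{
    // Returns the URL seen inside the scope and the URL seen after it.
    public static string[] Run()
    {
        var ctx = new FakeContext { SearchUrl = "https://search.twitter.com/" };
        string inside;

        using (new TemporaryValue(
            () => ctx.SearchUrl,
            url => ctx.SearchUrl = url,
            "https://api.twitter.com/1/"))
        {
            // A Trends query would run here against the new URL.
            inside = ctx.SearchUrl;
        }

        // Search queries after this point see the old URL again.
        return new[] { inside, ctx.SearchUrl };
    }
}
```

With a real TwitterContext, the two lambdas would read and write twitterCtx.SearchUrl instead, and the Trends query would sit inside the using block.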
So, now you know how to set the URL to be used for Trend queries and how to prevent whacking your Search queries. I’ll be updating the Trend API to use the same URL as all other APIs soon, so the only API left to use the SearchUrl will be Search, but for the short term, it’s Trends and Search. Until I make this change, you’ll have a viable work-around by setting the URL yourself, as explained above.
These were the Search and Trend URLs, but you might be curious about the second parameter of the TwitterContext constructor; that’s the URL for all other APIs (the BaseUrl), except for Trend and Search. Similarly, you can use the TwitterContext’s BaseUrl property to set the BaseUrl. Setting the BaseUrl can be useful when communicating with other services.
In addition to Twitter changing URLs, the Twitter API has been adopted by other companies, such as Identi.ca, Tumblr, and WordPress. Being able to set the BaseUrl lets you use LINQ to Twitter with any of these services. This is a testament to the success of the Twitter API and its popularity. No doubt we’ll have hills and valleys to traverse as the Twitter API matures, but hopefully there will be enough flexibility in LINQ to Twitter to make these changes as transparent as possible for you.
@JoeMayo
One of the things that might be surprising in the LINQ Distinct standard query operator is that it doesn’t automatically work properly on custom classes. There are reasons for this, which I’ll explain shortly. The example I’ll use in this post focuses on pulling a unique list of names to load into a drop-down list. I’ll explain the sample application, show you a typical first shot at Distinct, explain why it won’t work as you expect, and then demonstrate a solution to make Distinct work with any custom class.
The technologies I’m using are LINQ to Twitter, LINQ to Objects, Telerik Extensions for ASP.NET MVC, ASP.NET MVC 2, and Visual Studio 2010.
The function of the example program is to show a list of people that I follow. In Twitter API vernacular, these people are called “Friends”, though I’ve never met most of them in real life. This is part of the ubiquitous language of social networking, and Twitter in particular, so you’ll see my objects named accordingly. Distinct comes into play because I want a drop-down list with the names of the friends appearing in the list. Some friends are quite verbose, which means I can’t just extract names from each tweet and populate the drop-down; otherwise, I would end up with many duplicate names. Therefore, Distinct is the appropriate operator to eliminate the extra entries from my friends who tend to be enthusiastic tweeters. The sample doesn’t do anything with the drop-down list, and I leave its practical purpose up to your imagination; perhaps a filter for the list if I only want to see a certain person’s tweets, or maybe a quick list that I plan to combine with a TextBox and Button to reply to a friend. When the program runs, you’ll need to authenticate with Twitter, because I’m using OAuth (DotNetOpenAuth) for authentication, and then you’ll see the drop-down list of names above the grid with the most recent tweets from friends. Here’s what the application looks like when it runs:

As you can see, there is a drop-down list above the grid. The drop-down list is where most of the focus of this article will be. There is some description of the code before we talk about the Distinct operator, but we’ll get there soon.
This is an ASP.NET MVC2 application, written with VS 2010. Here’s the View that produces this screen:
<%@ Page Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage<TwitterFriendsViewModel>" %>
<%@ Import Namespace="DistinctSelectList.Models" %>
<asp:Content ID="Content1" ContentPlaceHolderID="TitleContent" runat="server">
    Home Page
</asp:Content>
<asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server">
    <fieldset>
        <legend>Twitter Friends</legend>
        <div>
            <%= Html.DropDownListFor(
                    twendVM => twendVM.FriendNames,
                    Model.FriendNames,
                    "<All Friends>") %>
        </div>
        <div>
            <% Html.Telerik().Grid<TweetViewModel>(Model.Tweets)
                   .Name("TwitterFriendsGrid")
                   .Columns(cols =>
                   {
                       cols.Template(col =>
                       { %>
                           <img src="<%= col.ImageUrl %>"
                                alt="<%= col.ScreenName %>" />
                       <% });
                       cols.Bound(col => col.ScreenName);
                       cols.Bound(col => col.Tweet);
                   })
                   .Render(); %>
        </div>
    </fieldset>
</asp:Content>
As shown above, the Grid is from Telerik’s Extensions for ASP.NET MVC. The first column is a template that renders the user’s Avatar from a URL provided by the Twitter query. Both the Grid and DropDownListFor display properties that are collections from a TwitterFriendsViewModel class, shown below:
using System.Collections.Generic;
using System.Web.Mvc;

namespace DistinctSelectList.Models
{
    /// <summary>
    /// For finding friend info on screen
    /// </summary>
    public class TwitterFriendsViewModel
    {
        /// <summary>
        /// Display names of friends in drop-down list
        /// </summary>
        public List<SelectListItem> FriendNames { get; set; }

        /// <summary>
        /// Display tweets in grid
        /// </summary>
        public List<TweetViewModel> Tweets { get; set; }
    }
}
I created the TwitterFriendsViewModel. The two Lists are what the View consumes to populate the DropDownListFor and Grid. Notice that FriendNames is a List of SelectListItem, which is an MVC class. Another custom class I created is the TweetViewModel (the type of the Tweets List), shown below:
namespace DistinctSelectList.Models
{
    /// <summary>
    /// Info on friend tweets
    /// </summary>
    public class TweetViewModel
    {
        /// <summary>
        /// User's avatar
        /// </summary>
        public string ImageUrl { get; set; }

        /// <summary>
        /// User's Twitter name
        /// </summary>
        public string ScreenName { get; set; }

        /// <summary>
        /// Text containing user's tweet
        /// </summary>
        public string Tweet { get; set; }
    }
}
The initial Twitter query returns much more information than we need for our purposes, and this is a special class for displaying info in the View. Now you know about the View and how it’s constructed. Let’s look at the controller next.
The controller for this demo performs authentication, data retrieval, data manipulation, and view selection. I’ll skip the description of the authentication because it’s a normal part of using OAuth with LINQ to Twitter. Instead, we’ll drill down and focus on the Distinct operator. However, I’ll show you the entire controller, below, so that you can see how it all fits together:
using System.Linq;
using System.Web.Mvc;
using DistinctSelectList.Models;
using LinqToTwitter;

namespace DistinctSelectList.Controllers
{
    [HandleError]
    public class HomeController : Controller
    {
        private MvcOAuthAuthorization auth;
        private TwitterContext twitterCtx;

        /// <summary>
        /// Display a list of friends' current tweets
        /// </summary>
        /// <returns>The Index view, or the result that begins OAuth authorization</returns>
        public ActionResult Index()
        {
            // Authenticate with OAuth before querying Twitter
            auth = new MvcOAuthAuthorization(
                InMemoryTokenManager.Instance,
                InMemoryTokenManager.AccessToken);

            string accessToken = auth.CompleteAuthorize();

            if (accessToken != null)
            {
                InMemoryTokenManager.AccessToken = accessToken;
            }

            if (auth.CachedCredentialsAvailable)
            {
                auth.SignOn();
            }
            else
            {
                return auth.BeginAuthorize();
            }

            twitterCtx = new TwitterContext(auth);

            // LINQ to Twitter: project friend tweets into view models
            var friendTweets =
                (from tweet in twitterCtx.Status
                 where tweet.Type == StatusType.Friends
                 select new TweetViewModel
                 {
                     ImageUrl = tweet.User.ProfileImageUrl,
                     ScreenName = tweet.User.Identifier.ScreenName,
                     Tweet = tweet.Text
                 })
                .ToList();

            // LINQ to Objects: unique screen names for the drop-down list
            var friendNames =
                (from tweet in friendTweets
                 select new SelectListItem
                 {
                     Text = tweet.ScreenName,
                     Value = tweet.ScreenName
                 })
                .Distinct(new SelectListItemComparer())
                .ToList();

            var twendsVM = new TwitterFriendsViewModel
            {
                Tweets = friendTweets,
                FriendNames = friendNames
            };

            return View(twendsVM);
        }

        public ActionResult About()
        {
            return View();
        }
    }
}
The important parts of the listing above are the LINQ queries for friendTweets and friendNames. Both of these results are used in the subsequent population of the twendsVM instance that is passed to the view. Let’s dissect these two statements for clarification and focus on what is happening with Distinct.
The query for friendTweets gets a list of the 20 most recent tweets (as specified by the Twitter API for friend queries) and performs a projection into the custom TweetViewModel class, repeated below for your convenience:
var friendTweets =
    (from tweet in twitterCtx.Status
     where tweet.Type == StatusType.Friends
     select new TweetViewModel
     {
         ImageUrl = tweet.User.ProfileImageUrl,
         ScreenName = tweet.User.Identifier.ScreenName,
         Tweet = tweet.Text
     })
    .ToList();
The LINQ to Twitter query above simplifies what we need to work with in the View and reduces the amount of information we have to look at in subsequent queries. Given the friendTweets above, the next query performs another projection into an MVC SelectListItem, which is required for binding to the DropDownList. This brings us to the focus of this blog post: writing a correct query that uses the Distinct operator. The query below uses LINQ to Objects, querying the friendTweets collection to get friendNames:
var friendNames =
    (from tweet in friendTweets
     select new SelectListItem
     {
         Text = tweet.ScreenName,
         Value = tweet.ScreenName
     })
    .Distinct()
    .ToList();
The above implementation of Distinct seems normal, but it is deceptively incorrect. After running the query above, by executing the application, you’ll notice that the drop-down list contains many duplicates. This will send you back to the code scratching your head, but there’s a reason why this happens.
To understand the problem, we must examine how Distinct works in LINQ to Objects. Distinct has two overloads: one without parameters, as shown above, and another that takes a parameter of type IEqualityComparer<T>. In the case above, with no parameters, Distinct uses EqualityComparer<T>.Default behind the scenes to compare items, through both Equals and GetHashCode, as it iterates through the list. You don’t have problems with the built-in types, such as string, int, DateTime, etc., because they all implement IEquatable<T>. However, many framework classes, such as SelectListItem, neither implement IEquatable<T> nor override Equals and GetHashCode. So, what happens is that EqualityComparer<T>.Default falls back to Object.Equals, which performs reference equality on reference type objects. You don’t have this problem with value types because ValueType overrides Equals to compare the values of the fields. However, most of your projections that use Distinct are on classes, just like the SelectListItem used in this demo application. So, the reason why Distinct didn’t produce the results we wanted is that we used a type that doesn’t define its own equality, and Distinct used the default reference equality. This resulted in all objects being included in the results because they are all separate instances in memory with unique references.
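To see the difference in isolation, here is a minimal sketch that doesn’t need MVC or Twitter. NameItem is a hypothetical stand-in for SelectListItem: a plain class with no equality of its own. The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SelectListItem: does not override
// Equals/GetHashCode and does not implement IEquatable<T>.
public class NameItem
{
    public string Text { get; set; }
    public string Value { get; set; }
}

public static class DefaultDistinctDemo
{
    // Projects each name into a new NameItem, then applies parameterless Distinct.
    public static int CountDistinctItems(IEnumerable<string> names)
    {
        return names
            .Select(name => new NameItem { Text = name, Value = name })
            .Distinct()   // default comparer: reference equality for NameItem
            .Count();
    }

    // Applies parameterless Distinct to the strings themselves.
    public static int CountDistinctStrings(IEnumerable<string> names)
    {
        return names.Distinct().Count();   // string compares by value
    }
}
```

For the names "JoeMayo", "JoeMayo", "twitterapi", "JoeMayo", CountDistinctItems returns 4 because every projected instance counts as unique, while CountDistinctStrings returns 2.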
As you might have guessed, the solution to the problem is to use the second overload of Distinct that accepts an IEqualityComparer<T> instance. If you were projecting into your own custom type, you could make that type implement IEquatable<T> (and override Equals and GetHashCode to match), but SelectListItem belongs to ASP.NET MVC, so we can’t change it. Therefore, the solution is to create a separate type that implements IEqualityComparer<T>, as in the SelectListItemComparer class, shown below:
using System.Collections.Generic;
using System.Web.Mvc;

namespace DistinctSelectList.Models
{
    public class SelectListItemComparer : EqualityComparer<SelectListItem>
    {
        public override bool Equals(SelectListItem x, SelectListItem y)
        {
            return x.Value.Equals(y.Value);
        }

        public override int GetHashCode(SelectListItem obj)
        {
            return obj.Value.GetHashCode();
        }
    }
}
The SelectListItemComparer class above doesn’t implement IEqualityComparer<SelectListItem> directly, but rather derives from EqualityComparer<SelectListItem>. Microsoft recommends this approach for consistency with the behavior of generic collection classes. However, if your comparer type already derives from another base class, go ahead and implement IEqualityComparer<T> directly, which will still work.
EqualityComparer<T> is an abstract class that implements IEqualityComparer<T> and declares Equals and GetHashCode as abstract methods. For the purposes of this application, the SelectListItem.Value property is sufficient to determine if two items are equal. Since SelectListItem.Value is type string, the code delegates equality to the string class. The code also delegates the GetHashCode operation to the string class. You might have other criteria in your own object and would need to define what it means for your object to be equal.
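As an illustration of different criteria, here is a sketch of a comparer that implements IEqualityComparer<T> directly, ignores case, and tolerates nulls. Friend and FriendScreenNameComparer are hypothetical names that are not part of the sample application, and case-insensitive matching is just an assumed rule for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for illustration.
public class Friend
{
    public string ScreenName { get; set; }
    public string ImageUrl { get; set; }
}

// Treats two friends as equal when their screen names match, ignoring case.
public class FriendScreenNameComparer : IEqualityComparer<Friend>
{
    public bool Equals(Friend x, Friend y)
    {
        if (ReferenceEquals(x, y)) return true;     // same instance, or both null
        if (x == null || y == null) return false;   // only one is null
        return string.Equals(
            x.ScreenName, y.ScreenName, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(Friend obj)
    {
        // Must agree with Equals: equal items have to produce equal hash codes.
        if (obj == null || obj.ScreenName == null) return 0;
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.ScreenName);
    }
}

public static class FriendComparerDemo
{
    // Counts friends that are unique by screen name, ignoring case.
    public static int CountUniqueFriends(IEnumerable<Friend> friends)
    {
        return friends.Distinct(new FriendScreenNameComparer()).Count();
    }
}
```

The key design point is that Equals and GetHashCode use the same rule; if they disagree, Distinct can let duplicates through.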
Now that we have an IEqualityComparer<SelectListItem>, let’s fix the problem. The code below modifies the query where we want distinct values:
var friendNames =
    (from tweet in friendTweets
     select new SelectListItem
     {
         Text = tweet.ScreenName,
         Value = tweet.ScreenName
     })
    .Distinct(new SelectListItemComparer())
    .ToList();
Notice how the code above passes a new instance of SelectListItemComparer as the parameter to the Distinct operator. Now, when you run the application, the drop-down list will behave as you expect, showing only a unique set of names.
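When the projected type is your own, an alternative is to give the type its own equality so that the parameterless Distinct works without a separate comparer. The sketch below assumes a hypothetical FriendName class, which is not part of the sample application, and implements IEquatable<T> along with matching Equals and GetHashCode overrides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical custom type that defines its own value equality.
public sealed class FriendName : IEquatable<FriendName>
{
    public FriendName(string screenName)
    {
        ScreenName = screenName;
    }

    public string ScreenName { get; private set; }

    // Used by EqualityComparer<FriendName>.Default.
    public bool Equals(FriendName other)
    {
        return other != null && ScreenName == other.ScreenName;
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as FriendName);
    }

    public override int GetHashCode()
    {
        return ScreenName == null ? 0 : ScreenName.GetHashCode();
    }
}

public static class EquatableDemo
{
    // Parameterless Distinct now compares by ScreenName.
    public static int CountDistinct(IEnumerable<string> names)
    {
        return names
            .Select(name => new FriendName(name))
            .Distinct()
            .Count();
    }
}
```

For the names "JoeMayo", "JoeMayo", "twitterapi", CountDistinct returns 2.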
In addition to Distinct, other LINQ Standard Query Operators, such as Contains, Except, Intersect, Union, GroupBy, and ToDictionary, have overloads that accept an IEqualityComparer<T>. You can use the same technique shown here with SelectListItemComparer with those other operators as well. Now you know how to resolve problems with getting Distinct to work properly and also have a way to fix problems with other operators that require equality comparisons.
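Here is a short sketch of a few of those overloads. To keep it self-contained it uses the built-in StringComparer.OrdinalIgnoreCase, which is an IEqualityComparer<string>, rather than a custom comparer; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComparerOverloadsDemo
{
    private static readonly IEqualityComparer<string> IgnoreCase =
        StringComparer.OrdinalIgnoreCase;

    // Contains with a comparer: is this screen name already in the list?
    public static bool IsFollowing(IEnumerable<string> friends, string screenName)
    {
        return friends.Contains(screenName, IgnoreCase);
    }

    // Except with a comparer: candidates that are not friends yet.
    public static List<string> NotYetFollowed(
        IEnumerable<string> candidates, IEnumerable<string> friends)
    {
        return candidates.Except(friends, IgnoreCase).ToList();
    }

    // GroupBy and ToDictionary with a comparer: tweets per screen name.
    public static Dictionary<string, int> TweetCounts(IEnumerable<string> screenNames)
    {
        return screenNames
            .GroupBy(name => name, IgnoreCase)
            .ToDictionary(g => g.Key, g => g.Count(), IgnoreCase);
    }
}
```

Any IEqualityComparer<T> can be passed the same way, including a custom one written for your own type.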
@JoeMayo
I almost tweeted a reply to Capar Kleijne's question about comments on Twitter, but realized that my opinion exceeded 140 characters. The following is based upon my experience with extremes and approaches that I find useful in code comments.
There are a couple of extremes that I've seen and reasons why people go the distance in each approach. The most common extreme is no comments in the code at all. A few bad reasons why this happens are that a developer is in a hurry, is sloppy, or is interested in job preservation. The unfortunate result is that the code is difficult to understand and hard to maintain. The drawbacks of uncommented code are a primary reason why teachers drill the need for commenting code into our heads. This viewpoint assumes the lack of comments is bad because the code is bad, but there is another reason for not commenting that is gaining more popularity.
I've heard and read that code should be self-documenting. Following this thought pattern, if code is well written with meaningful names, there should not be a reason for comments. An addendum to this argument is that comments are often neglected and get out of date, but the code is what is kept up to date. Presumably, if code contained very good naming, it would be easy to maintain. This is a noble perspective and I like the practice of meaningful naming of identifiers. However, I think it's also an extreme approach that doesn't cover important cases. For example, if an identifier is named badly (subjective differences in opinion) or not changed appropriately during maintenance, then the badly named identifier is no more useful than a stale comment. These were the two no-comment extremes, so let's look at the too-many-comments extreme.
On a regular basis, I'll see cases where the code is over-commented; not nearly as often as the no-comment scenarios, but still prevalent. These are examples of where every single line in the code is commented. These comments make the code harder to read because they get in the way of the algorithm. In most cases, the comments parrot what each line of code does. If a developer understands the language, then most statements are immediately intuitive. For example, what use is it to say that I'm assigning foo to bar when it's clear what the code is doing? I think that over-commenting code is a waste of time that slows down initial development and maintenance. Understandably, the developer's intentions are admirable because they've had it beaten into their heads that they must comment. However, I think it's an extreme and prefer a more moderate approach.
I don't think the extremes do justice to code because each can make maintenance harder. No comments on bad code is obviously a problem, but the other two extremes are subtle and require qualification to address properly. The problem I see with the code-as-documentation approach is that it doesn't lift the developer out of the algorithm to identify dependencies, intentions, and hacks. Any developer can read code and follow an algorithm, but they still need to know where it fits into the big picture of the application. Because of indirections with language features like interfaces, delegates, and virtual members, code can become complex. Occasionally, it's useful to point out a nuance or reason why a piece of code is there. For example, if you're building an app that communicates via HTTP, you'll have certain headers to include for the endpoint, and it could be useful to point out why the code for setting those header values is there and how they affect the application.
An argument against this could be that you should extract that code into a separate method with a meaningful name to describe the scenario. My problem with such an approach would be that your code base becomes even more difficult to navigate and work with because you have all of this extra code just to make the code more meaningful. My opinion is that a simple and well-stated comment giving the reasons and intention for the code is more natural and convenient for the initial developer and the maintainer. I just don't agree with the approach of going out of the way to avoid making a comment. I'm also concerned that some developers would take this approach as an excuse to not comment their bad code.
Another area where I like comments is documentation comments. Java has them and so do C# and VB. They're convenient because we can build automated tools that extract these comments. These extracted comments are often much better than no documentation at all. The "go read the code" answer doesn't always fulfill the need for a quick summary of an API.
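To make the moderate approach concrete, here is a small sketch that combines a documentation comment at the API level with a single code comment that explains intent rather than mechanics. The class, the method, and the endpoint's dislike of the Expect header are all hypothetical, chosen only to illustrate the HTTP header case mentioned above.

```csharp
using System;
using System.Net.Http;

public static class StatusRequestFactory
{
    /// <summary>
    /// Builds the request used to post a status update.
    /// </summary>
    /// <param name="endpoint">Absolute URL of the (hypothetical) status endpoint.</param>
    /// <returns>A POST request that is ready to send.</returns>
    public static HttpRequestMessage CreateUpdateRequest(Uri endpoint)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, endpoint);

        // Why, not what: this hypothetical endpoint rejects requests that carry
        // "Expect: 100-continue", so the header is switched off explicitly.
        request.Headers.ExpectContinue = false;

        return request;
    }
}
```

The remaining statements are left uncommented because they already say what they do.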
To summarize, I think that the extremes of no comments and too many comments are less than desirable approaches. I prefer documentation comments to explain each class and member (API level) and code comments as necessary to supplement well-written code.
Joe