Join me at my new blog at MSN Spaces. Thanks geekswithblogs for the solid hosting; I plan to leave these posts up here until Spaces can import them in some magical future.
You might be wondering why some classes in the .NET Framework have properties, other classes have GetXXX methods, and some have a combination of the two. The reason is that the semantics, or meaning, of the two are quite different.
Consider this class:
public class Customer
{
public int CustomerID;
public string FirstName;
public string LastName;
public Order[] GetOrders()
{
// do database work
}
}
The CustomerID, FirstName, and LastName represent the customer state. The orders are derived from the customer state using GetOrders(). In fact I do not support having a method like this because it presumes that a canonical representation of Order exists. That's another topic.
A great rule-of-thumb regarding properties is to consider that the Visual Studio debugger executes the getter method when watching an object. That is, property accessors are executed at unpredictable times and should thus not cause any discernible side-effects. Abusing properties can lead to heisenbugs.
Another reason to use a method is that order retrieval is likely to be parameterized.
Finally, a good practice is for property access to be computationally cheap; client code should not be forced to place the property value into a local variable - it's premature optimization. Expensive code should be placed in methods - client code is sure to store and reuse the result. The exception is if the property getter method caches the result in a private field, which can be a challenge as the class instance is mutated.
The property syntax in .NET languages is not syntactic sugar. It adds considerable richness to classes.
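To make the distinction concrete, here is a minimal sketch of the rule-of-thumb. It is my own illustration, not code from the Framework: the FullName property, the `since` parameter, and the in-memory list standing in for the database are all assumptions.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public int OrderID;
    public int CustomerID;
    public DateTime PlacedOn;
}

public class Customer
{
    // Hypothetical in-memory stand-in for the database.
    private readonly List<Order> store;

    public Customer(int customerID, string firstName, string lastName, List<Order> store)
    {
        CustomerID = customerID;
        FirstName = firstName;
        LastName = lastName;
        this.store = store;
    }

    // State: cheap to read and free of side effects.
    public int CustomerID { get; private set; }
    public string FirstName { get; private set; }
    public string LastName { get; private set; }

    // Still cheap and side-effect free, so a debugger watch can evaluate it safely.
    public string FullName
    {
        get { return FirstName + " " + LastName; }
    }

    // Derived, potentially expensive, and parameterized: a method, not a property.
    public Order[] GetOrders(DateTime since)
    {
        List<Order> result = new List<Order>();
        foreach (Order o in store)
        {
            if (o.CustomerID == CustomerID && o.PlacedOn >= since)
            {
                result.Add(o);
            }
        }
        return result.ToArray();
    }
}
```

Client code can read FullName as often as it likes, but it will naturally hold on to the array returned by GetOrders rather than call it in a loop.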
You can convert local datetime values to UTC datetime values, and vice-versa, using the built-in GETUTCDATE() function:
DECLARE @LocalDate DATETIME
SET @LocalDate = GETDATE()
-- convert local date to utc date
DECLARE @UTCDate DATETIME
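-- DATEDIFF(Hour, GETDATE(), GETUTCDATE()) is the number of hours to add to local time to reach UTC
SET @UTCDate = DATEADD(Hour, DATEDIFF(Hour, GETDATE(), GETUTCDATE()), @LocalDate)
-- convert utc date to local date
DECLARE @LocalDate2 DATETIME
-- DATEDIFF(Hour, GETUTCDATE(), GETDATE()) is the number of hours to add to UTC to reach local time
SET @LocalDate2 = DATEADD(Hour, DATEDIFF(Hour, GETUTCDATE(), GETDATE()), @UTCDate)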
SELECT @LocalDate, @UTCDate, @LocalDate2
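Note that GETUTCDATE() returns the current datetime in UTC. By comparing the value with GETDATE() we can determine the server's current offset from UTC, which can then be used to adjust any date. Because the offset is measured at the moment of the call and in whole hours, a date that falls under a different daylight saving rule, or a time zone with a half-hour offset, will not convert exactly.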
I tried to bake these expressions into a set of user-defined functions, but SQL Server complained because user-defined functions cannot call non-deterministic functions (in this case GETDATE()/GETUTCDATE()).
Comments within a C# method body tend to A) make a problem statement then B) document the steps to solve the problem. The problem statements tend to be multi-line; the details of the solution tend to be single-line.
I want the problem statements to stand out. Traditionally I use #region blocks to group the problems. Unfortunately, region labels are single-line. They are also slightly more laborious to construct.
Lately I have been using a triple-slash comment block at the top of the method body to state the problem:
/// <summary>
/// Occurs when the page loads.
/// </summary>
private void Page_Load(object sender, System.EventArgs e)
{
/// Configure the http response to return a correct set of
/// cache headers. Validation headers must also be returned.
// set the etag
this.Response.Cache.SetETag("055aba934a3c21:380");
// set the cache policy
this.Response.Cache.SetCacheability(HttpCacheability.Public);
}
Another reason to use the triple-slash comment block within a method body, as opposed to the <remarks> block, is that it does not form a part of the API documentation. That is, the client-supplier relationship plays a role here. It is desirable to separate interface documentation from implementation documentation - they are intended for two different audiences. For a given method the external documentation should be placed in a <remarks> block above the method declaration; the internal documentation should be placed in triple-slash comment blocks at the top of the method body. It will naturally stand out from the implementation specifics.
This approach works better than using /* */-style comments blocks because such comments cannot be nested.
Incidentally, VS.NET has a handy button on the Text Editor toolbar to comment out the selected lines. It adds double-slash comments at the extreme left side. A complementary button exists that will uncomment the selected lines. It seems that left-justified comments are reserved for commenting-out code, not documentation.
I am struck by the consequences of the experience Craig relates in this post:
http://staff.develop.com/candera/weblog2/CommentView.aspx?guid=54fd1ed8-ea12-4555-9ee3-4f9e3d57907e
He is discussing a Wiki site that mysteriously had its page content rolled back to a previous version. Wiki pages generally contain a link to do this, and it is assumed that a human would follow it with caution. Turns out that GoogleBot crawled the site and what-do-you-know the content rolled back. The underlying problem was that a GET request caused a significant side effect, in this case altering the content of the linking page.
The HTTP specification recommends that GET requests be safe and idempotent: a GET should have no significant side effects, and repeating it any number of times should have the same effect as issuing it once. The intent is that GETs are “safe and repeatable”. That is, it should be safe for a crawler to invoke any GET request (subject to authorization of course).
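As a rough sketch of the principle, and not the code of any actual Wiki engine, a handler can refuse to perform the rollback unless the request is a POST. The WikiPage and WikiHandler types, the action names, and the use of bare status codes are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class WikiPage
{
    private readonly List<string> versions = new List<string>();

    public WikiPage(string initial) { versions.Add(initial); }

    public string Content
    {
        get { return versions[versions.Count - 1]; }
    }

    public void Edit(string text) { versions.Add(text); }

    // Discards the latest version, if there is an earlier one to fall back to.
    public void Rollback()
    {
        if (versions.Count > 1) versions.RemoveAt(versions.Count - 1);
    }
}

public static class WikiHandler
{
    // Returns an HTTP status code for the given method and action.
    public static int Handle(string httpMethod, string action, WikiPage page)
    {
        if (action == "view")
        {
            return 200; // reading is safe under any method
        }
        if (action == "rollback")
        {
            // A state change must never ride on a GET; a crawler following
            // the link gets 405 Method Not Allowed and nothing happens.
            if (httpMethod != "POST") return 405;
            page.Rollback();
            return 200;
        }
        return 404;
    }
}
```

With this in place the rollback link has to become a form or button that posts, which a crawler will not submit.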
This bears on the question of how to best use ASP.NET controls, notably the data controls where you might attempt to optimize ViewState by eliminating postbacks and changing buttons to hyperlinks. Note that the LinkButton is safe because it causes a postback. You can also probably get away with using a form with method=GET because a spider is unlikely to submit a form (even one based on GET), but I don't recommend it.
I traditionally replace postbacks with links to optimize ViewState, as opposed to “simply” disabling ViewState, because a race condition exists if you do not have a DataKeys collection that is consistent before and after the postback. The race is that the resultset might have changed in the meantime, and the action would thus be carried out on the wrong data key. That is, you would be operating purely by ordinal. So, with this idempotence revelation I am back to square one with optimizing DataGrids.
When catching an exception in C# you need not declare a variable for the exception instance if you do not need it. This will avoid the “unused variable” warning.
try { /* ... */ }
catch(Exception) // catches without assigning to a variable
{
// ...
}
Here is how to rethrow an exception without affecting the stack trace:
try { /* ... */ }
catch(System.Data.SqlClient.SqlException ex)
{
if(ex.Number==2627) { /* special case for primary key violation */ }
else throw; // rethrows without affecting the stack trace
// else throw ex; // rethrows with the stack trace reset to this line
}
Edamno has released a VB.NET library for creating various shell extensions
- Browser Helper Objects
- Context Menu Handlers
- IE Menu Buttons
- InfoTips
- Property Sheets
- Thumbnails
http://www.mvps.org/emorcillo/dotnet/shell/shellextensions.shtml
He also has a library for working with Windows Task Scheduler, and, most importantly, a library for working with OLE Structured Storage. Now you can read and write the author and comments associated with a file!
It is a known hard problem to match nested parenthesis pairs using regular expressions. Put another way, regular expressions do not typically support counting occurrences. .NET has a little-known Regex construct for doing just that called the “balancing group definition”:
Balancing group definition. Deletes the definition of the previously defined group name2 and stores in group name1 the interval between the previously defined name2 group and the current group. If no group name2 is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct allows the stack of captures for group name2 to be used as a counter for keeping track of nested constructs such as parentheses. In this construct, name1 is optional. You can use single quotes instead of angle brackets; for example, (?'name1-name2').
I found great information on using the balancing group definition here:
http://www.oreilly.com/catalog/regex2/chapter/index.html
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/cpcongroupingconstructs.asp?frame=true
In my opinion this construct greatly enhances the utility of regular expressions.
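Here is a short sketch of the construct in use. The ParenChecker class and the group names Open and Close are my own choices for illustration; the pattern pushes a capture onto the Open stack for each "(" and pops one for each ")", then fails the match if anything is left on the stack.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParenChecker
{
    // (?<Open>\()        pushes a capture for every opening parenthesis
    // (?<Close-Open>\))  pops one Open capture for every closing parenthesis;
    //                    if no Open capture is left, this branch cannot match
    // (?(Open)(?!))      fails the whole match if any Open capture remains
    private static readonly Regex Balanced = new Regex(
        @"^[^()]*(((?<Open>\()[^()]*)+((?<Close-Open>\))[^()]*)+)*(?(Open)(?!))$");

    public static bool IsBalanced(string input)
    {
        return Balanced.IsMatch(input);
    }
}
```

IsBalanced("(a(b)c)") returns true, while IsBalanced("(a(b)c") and IsBalanced("a)b(") both return false.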
If you are using IHttpHandlers to serve dynamic content, or generating a large directory structure to be served statically, here is some information you should know.
We use a custom IHttpHandler to dynamically serve (and cache) images from our database. For simplicity, the handler uses Request.PhysicalPath as the ultimate cache location. That is, a url “~/cache/123/1-Large.image“ is mapped to “<path to application root>\cache\123\1-Large.jpg“ by the handler. If the file exists, it is served, otherwise it is read from the database, cached to the local file, then served. The '123' directory is created on-demand.
The main advantage of using Request.PhysicalPath is that no configuration parameter is needed to indicate the cache base path. Also it is highly performant on a cache hit.
Recently we ran a utility program called Handle against our production web servers. Handle emits a list of open file/directory handles. We noticed that each directory involved in a request to the image handler had an open handle! That is, “<path to application root>\cache\123” was held open.
The reason is that ASP.NET sets up a FileMonitor for each directory, to monitor for web.config files should one be created.
We simply stopped using Request.PhysicalPath and turned to mapping the request path to a private cache directory. Since the monitor is only set up for existing directories, it is safe to use a handler in this way, so long as the underlying 'natural' directory structure is not created.
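The mapping itself can be sketched as follows. This is not our production handler: the ImageCachePath class, the Map method, and the idea of passing the cache root in as a parameter are assumptions made for illustration, and the root is assumed to live outside the application's own directory tree.

```csharp
using System;
using System.IO;

public static class ImageCachePath
{
    // Maps an app-relative url such as "~/cache/123/1-Large.image" to a file
    // beneath a private cache root (hypothetical; supplied by the caller).
    public static string Map(string cacheRoot, string appRelativePath)
    {
        string relative = appRelativePath;
        if (relative.StartsWith("~/", StringComparison.Ordinal))
        {
            relative = relative.Substring(2);
        }
        relative = relative.Replace('/', Path.DirectorySeparatorChar);

        string root = Path.GetFullPath(cacheRoot);
        string full = Path.GetFullPath(Path.Combine(root, relative));

        // Reject anything that resolves outside the cache root, e.g. "~/../secret".
        string separator = Path.DirectorySeparatorChar.ToString();
        string rootWithSeparator = root.EndsWith(separator, StringComparison.Ordinal)
            ? root
            : root + separator;
        if (!full.StartsWith(rootWithSeparator, StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("Path escapes the cache root.", "appRelativePath");
        }
        return full;
    }
}
```

The price is the configuration parameter that Request.PhysicalPath let us avoid: the handler now needs to be told where the cache root is.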
I have heard Don Box speak about the caveats of using the Indigo service model. He states that service classes should not use transport-specific APIs - notably HttpContext. The logic is obvious.
The problem is that my ASMX web services must often use XSD files that are in the same directory as the ASMX file. HttpContext.Current.Server.MapPath is used to resolve to a local path. What is the recommended API for doing such resolution without HttpContext? Don, your comments would be appreciated.
Have you wondered how the XSD tool and the WSDL tool in the .NET Framework can target numerous managed languages? Internally, they use a great technology called CodeDOM. CodeDOM is a set of classes in the .NET Framework that facilitate code generation. Here is a sample of CodeDOM in action, generating an Order class.
using System;
using System.CodeDom;
using System.Reflection;
public class CodeGenerator {
public void GenerateOrderType() {
// declare an "Order" type
CodeTypeDeclaration decl = new CodeTypeDeclaration("Order");
// implement ICloneable
decl.BaseTypes.Add(typeof(ICloneable));
// add a field
decl.Members.Add(new CodeMemberField(typeof(int), "OrderID"));
...
}
}
The XSD tool builds up a CodeDOM compile unit describing the classes to be generated, then instantiates a CodeDOM provider to generate source code in a specific language. Providers exist for many .NET Framework languages. The tool supports a command-line switch to specify the desired CodeDOM provider.
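The compile-unit-then-provider flow can be sketched like this. It is not the XSD tool's actual code; the OrderGenerator class and the "Sample" namespace are hypothetical, and on runtimes other than the .NET Framework the System.CodeDom types may need to be referenced as a separate package.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class OrderGenerator
{
    public static string GenerateOrderSource()
    {
        // Describe the type in a language-neutral way.
        CodeTypeDeclaration decl = new CodeTypeDeclaration("Order");
        CodeMemberField field = new CodeMemberField(typeof(int), "OrderID");
        field.Attributes = MemberAttributes.Public;
        decl.Members.Add(field);

        // Wrap it in a namespace and a compile unit.
        CodeNamespace ns = new CodeNamespace("Sample");
        ns.Types.Add(decl);
        CodeCompileUnit unit = new CodeCompileUnit();
        unit.Namespaces.Add(ns);

        // The provider decides which language the source is emitted in.
        using (CodeDomProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```

Swapping CSharpCodeProvider for Microsoft.VisualBasic.VBCodeProvider should emit the same class as VB source without touching the compile unit.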
By writing a custom CodeDOM provider, we can augment the generated classes with great features - replace fields with properties, replace arrays with strong collections, implement ICloneable, generate IComparer instances, and more!
At Point2 I wrote an extensive library to do just that. We recently shared-sourced it here: http://www.gotdotnet.com/Community/Workspaces/Workspace.aspx?id=80258a8c-bb4c-48e6-948b-05f6da568f55
Enjoy!
I am so sick of seeing this incorrect usage of try..finally:
Stream s = null;
try {
s = new FileStream(...);
...
}
finally {
if(s!=null) s.Close();
}
This is WRONG. The work done in the finally block need only be done if the stream is opened. Until the assignment of 's' completes, the finally block is not to be executed. The 'if' statement is basically checking that fact. The correct usage is:
Stream s = new FileStream(...);
try {
...
}
finally {
s.Close();
}
No excuses! Programmers are lucky that the 'if' check can even be done - the anti-pattern doesn't work in general. Consider this scenario involving thread synchronization:
Monitor.Enter();
try {
...
}
finally {
Monitor.Exit();
}
To write the above using the anti-pattern, one would need to be able to check that the monitor was entered, which may or may not be possible. Without a check, the anti-pattern might exit a monitor that was not entered.
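For reference, here is a compilable sketch of the acquire-then-try shape for both cases. The Counter and StreamDemo classes are made up for illustration. Note that the C# using statement acquires its resource before entering the try block, so it gives you the correct stream pattern without spelling it out.

```csharp
using System;
using System.IO;
using System.Threading;

public class Counter
{
    private readonly object gate = new object();
    private int value;

    public int Increment()
    {
        // Acquire first. If Enter throws, there is nothing to release.
        Monitor.Enter(gate);
        try
        {
            value++;
            return value;
        }
        finally
        {
            // Reached only when the monitor was actually entered.
            Monitor.Exit(gate);
        }
    }
}

public static class StreamDemo
{
    public static long WriteAndMeasure(byte[] data)
    {
        // The stream is constructed before the implicit try block begins,
        // and disposed in the implicit finally block.
        using (Stream s = new MemoryStream())
        {
            s.Write(data, 0, data.Length);
            return s.Length;
        }
    }
}
```

Later versions of the Framework (4.0 onward) added a Monitor.Enter(object, ref bool) overload that reports whether the lock was taken, which is precisely the check the anti-pattern would need.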
With respect to exception handling, to catch an exception for Enter through to Exit the correct pattern is:
try {
Monitor.Enter();
try {
...
}
finally {
Monitor.Exit();
}
} catch(Exception ex) { ... }
To handle an exception on Enter separately from the rest, use:
try {
Monitor.Enter();
}
catch(Exception ex) { ... }
try {
...
}
catch(Exception ex) { ... }
finally {
Monitor.Exit();
}
Microsoft programmers use this anti-pattern all the time. Stop it!
I just realized that partial classes will encourage AOP. A small example of this is with respect to import statements (using directives in C#). A class might have a dozen import statements. Each import relates to specific functionality of the class. If those functions are largely independent, you can now separate them into multiple partial classes, even within the same source file. Consequently, the import statements are closely associated with the implementation.
For example, this code would appear within one file. Obviously, only one class is being defined.
namespace Sample.Something
{
using System.Web;
using System.Web.Services;
[WebService]
public partial class FooService : System.Web.Services.WebService
{
[WebMethod] public Foo[] Search() { ... }
}
}
namespace Sample.Something
{
using System.Xml;
using System.Xml.Serialization;
public partial class FooService
{
internal void SerializeToXml(XmlWriter writer) { ... }
}
}
namespace Sample.Something
{
using System.IO;
public partial class FooService
{
internal void WriteLog(Stream s) { ... }
}
}
Since seeing MR I have been puzzled over the ending. Last night, I realized something that has really helped. Agent Smith is deleted. It comes down to purpose. Agent Smith's purpose is to kill The One. We are told many times that once a program's purpose is fulfilled it is deleted. Neo sacrifices himself to fulfill Agent Smith's purpose, and thus make Smith eligible for deletion. Upon absorbing Neo, Smith is confused and wonders aloud, “is it over?”. Neo/Smith nods with a big grin. The machines then issue the command and announce “it is done”. This explains why only Neo and the machines together can kill Smith.
Earlier, Smith philosophizes with Neo, saying “the purpose of life is to die”. That is, the purpose in life is to identify and fulfill your purpose. By doing so, you will die.
Only at the end of MR is Neo truly The One. Why? To become The One you must free yourself from all systems of control. In the end Neo simply refuses to die, to “have an end”. Smith: “Why, why get up? Why keep fighting? [...] Is it freedom [...]?” Neo: “Because I choose to“. Neo has attained ultimate free will. For that, the machines honor him by raising his body to the sky. Morpheus (M1): “Free your mind“. Note that Smith finally calls Neo by the name “Neo” (not “Mr. Anderson“) when he says “Everything that has a beginning has an end, Neo.” Does this mean that the Oracle was dominating him at that moment, or does it mean that Smith has accepted that Neo is The One? If the latter, then it explains why Neo must survive until that moment. Smith immediately says “What did I just say?”. That too can be taken either way. Could mean that he didn't hear himself, or that he is reacting to his own word choice.
Hello, another thread compelled me to spell out what my supposed state-of-the-art thinking is on database object names.
Sql Server has no concept of a namespace for object names. That is, all names for tables and views (and more) must be unique. Well, there is in fact a single namespacing concept - the owner!
We create owners for the logical namespaces in our product. The views and (primarily) computed tables that belong to that namespace will be owned by the appropriate sql account. In that way, intra-namespace calls between relations need not be qualified, but inter-namespace calls must be.
Also, the same relation name can be used by various subsystems, as they implement their views. This avoids the ugly problem of assigning arbitrary suffixes to view names. For example, consider possible view names over some table "dbo.Equipment" - "dbo.EquipmentSummary", "dbo.EquipmentView", "dbo.EquipmentDetails". OUCH. I prefer “[MyProduct.Admin].Equipment“, “[MyProduct.Search].Equipment“.
The application does not log in to SQL Server using these accounts - yet. In time, we might do this and finally be able to take advantage of SQL Server's security mechanisms.
It makes sense in a twisted way, hey? The namespaces "own" the relations! Of course, triggers and procedures are also treated in this way. It's helpful when a certain base table has many orthogonal triggers that those triggers are owned by the subsystem that requires them.
Note that, within Enterprise Manager, you will sort by Owner, not Name, to get a nice subsystem-centric view of your db objects.
We are early into the implementation of this approach. Please give me your feedback/concerns, thanks!
Regarding PDC, I had a blast! I am pumped about Whidbey, Yukon, Longhorn…everything. Great to be back with my family, Tonya and Ava. Oh and another in beta!
The Yukon caching stuff demands a renewed search for “indexed view”-friendly queries. That is, to benefit from Yukon’s cache invalidation stuff, you must follow the constraints of indexed views. Until PDC it was a mystery to me as to how cache invalidation was implemented. In retrospect it is obvious that indexed views solve the essential invalidation problem. Yukon surfaces the invalidation event rather than updating an index. I am presuming that indexed views are not recalculated from scratch when an underlying table changes.
Yukon's mechanisms are not finalized. Mike Pizzo says that they will enhance the technology, either by:
- automatically simplifying complex queries to invalidation-friendly ones.
- accepting a set of secondary, invalidation-friendly, queries such that you can setup a more pessimistic invalidation.
Either mechanism will introduce pessimism. He says that additional features will impact whether the callback can actually return the inserted/deleted rows - a deadly feature that is under evaluation.
Back to the point, I want my queries to be compatible with invalidation and indexed views. I want to enumerate ways to achieve this:
- Use triggers to maintain precomputed tables. Generate these tables to allow your queries to be friendly.
- Similarly, denormalize to other database on a schedule (wishful)
Please reply with ideas for effective use of indexed views. Thanks!
Rob Howard presented the expanded support for caching in ASP.NET. Two main features were presented:
CacheDependency is now extensible. By overriding a small set of virtual methods, you can implement any cache dependency scheme. For example you can call a web service to check the validity of some data. You can use your custom CacheDependency instance to control output caching of a page:
Dim myDependency As New MyCacheDependency(...)
Response.AddCacheDependency(myDependency)
The other main feature is that Whidbey will support database cache dependencies. There is a separate implementation for Sql Server 7/2000 and for Yukon.
- Sql Server 7 & 2000 will support table-level dependency tracking. That is, for a given query, you declare the set of tables to watch. If data is changed in any of those tables, the cache entry will expire. Obviously, this is extremely pessimistic and thus is only suitable for tables that change infrequently. Also, a light-weight polling mechanism is used: when data changes, a trigger writes to a table with one row per table that is being watched. On a configurable interval, ASP.NET will poll the table and invalidate entries appropriately.
- Yukon will support row-level dependency tracking. It works in roughly the same way as indexed views. Think about it - when a table changes, the corresponding indexed views are updated. The same analysis that provides that functionality also provides the cache invalidation feature. The consequence is that the query in question is subject to the same constraints as indexed views. If a query is not supported, an invalidation notification is fired immediately. Exciting! Note that the client-side plumbing is handled by ADO.NET.
Database-driven cache invalidation is a real gift to the world!
See my post, http://geekswithblogs.net/ewright/posts/309.aspx, for information on achieving table-level invalidation in ASP.NET 1.0!
Did you know that you can achieve table-level cache invalidation with ASP.NET today? Here's how:
The stock ASP.NET 1.0 CacheDependency class can monitor a local or network file. As the file changes, the appropriate cache entries are evicted from the cache. We can leverage this to achieve near-realtime cache invalidation as database data changes.
- Write a trigger for each table you want to monitor. The trigger will touch a file on the database's filesystem. The filename will correspond to the table name. This can be achieved by invoking the FileSystemObject scripting object.
CREATE PROCEDURE TouchFile(@FileName varchar(255)) AS
DECLARE @FS int, @OLEResult int, @FileID int
EXECUTE @OLEResult = sp_OACreate 'Scripting.FileSystemObject', @FS OUT
IF @OLEResult <> 0 RAISERROR ('could not create FileSystemObject',16,1)
-- touch the file
execute @OLEResult = sp_OAMethod @FS, 'OpenTextFile', @FileID OUT, @FileName, 2, 1
EXECUTE @OLEResult = sp_OADestroy @FileID
EXECUTE @OLEResult = sp_OADestroy @FS
- Share the relevant folder in the database's filesystem.
- Setup CacheDependency instances to watch the appropriate files for changes. Use methods on the Response object to interoperate with page-level output caching. Use the raw caching API for arbitrary data caching.
This technique offers a number of advantages over the proposed polling mechanism in Whidbey (see earlier post).
- The invalidation will occur more quickly than with a polling mechanism.
- No polling is used - the caching API uses the FileSystemMonitor component, that in turn leverages a callback mechanism to detect filesystem changes.
Other notables about this approach:
- Like the Whidbey mechanisms, it is web garden and web cluster friendly.
- An extremely narrow race condition might exist. It occurs when two or more triggers attempt to touch the same file. One will succeed, others will fail. Will this result in stale data? Given the asynchronous nature of FileSystemMonitor, it is extremely unlikely. Nonetheless, you might want to bat around the scenarios in your head.
- If the file cannot be touched, the procedure will nonetheless complete. That is, the race does not defeat the transaction and is safe.
- This is a hack.
I hammered the SQLXML guys at the PDC today. I want to know whether SQLXML is regarded (rightly) as awesome progressive technology, or merely a step towards either ObjectSpaces or (worse) XML datatypes in Yukon. Some points:
- SQLXML is not deprecated.
- It has been ported to fully managed code for Whidbey timeframe.
- In terms of the internal implementation, it will not depend on FOR XML syntax in the future, but rather on multiple active resultsets and client-side post-processing. Seems ass-backwards to me - they need deeper integration with the relational engine to avoid any denormalization whatsoever.
- In future releases you will not annotate your schema with SQLXML attributes. Instead, you will write a separate mapping file. BOO. Well, OK, it's no big deal. If you want to keep your annotations within the schema, use an XSL transformation to generate the mapping file. I think of annotations like custom attributes in managed code. It's a great innovation that they are inline, rather than in a separate file.
- ObjectSpaces and SQLXML use the same engine. I might be misunderstanding - the statement was interleaved with another conversation.
No matter what the implementation details are, the great thing about SQLXML is that from the client point of view, you get a normalized, hierarchical snapshot of your data in one round-trip to the server. It eliminates ugly SQL code in your app - you know, the stuff that either executes a query to fetch the details for each master record (aaah!) or re-normalizes with conditional logic. Plus, as the SQLXML implementation improves, you get wholesale speed improvements with no code or database changes!
I think SQLXML is a first step in a revolution in database data interchange. I like to compare it to the success of XML itself. XML is nothing but a scheme for encoding structured data in a text file. It has thoroughly replaced the Comma-Separated-Values (CSV) format because it can capture hierarchical data structures. Well, the rowset is today the primary format (so to speak) for transferring structured data from a database to a client. By replacing the rowset with an XML stream, we can enjoy the same benefits as with the transition away from CSV to XML. We can now transfer hierarchical (master-detail) data in its natural, normalized state. There are also performance advantages associated with doing so, not the least of which is increased network efficiency and fewer round-trips.
Don't even get me started on the righteousness of defining your DAO using XML Schema (XSD), which forms the center of the SQLXML approach. You can use schema to generate classes, and thus achieve complete Object-Xml-Relational mapping. Note that the Xml itself becomes an implementation detail. That's right, SQLXML provides an excellent foundation for Object-Relational mapping. That is the reason why we regard Microsoft's ObjectSpaces as a competitor to SQLXML. My colleague is particularly concerned:
http://weblogs.asp.net/jcollins/posts/33991.aspx
Let me mention the power and utility of XSLT. If your data is in XML (in this case, a schema definition) you are in a great position. They want a different vocabulary? Fine - transform! XSLT files are succinct, flexible, and resilient (that is, quite stable over time), and can emit non-XML output, like SQL statements. Virtually every project I work on involving XML also involves XSLT.
SQLXML is my favorite SQL Server technology. You wanna talk about it? I'd love to! Mail or reply to this post.
Found an interesting paragraph in an iisanswers article. Have not verified this information.
http://www.iisanswers.com/articles/IIS51.htm
Connection Limits
XP Pro allows 10 connections. This limit is installed by default in the metabase key MaxConnections for W3SVC, and there is no user interface method for modifying the setting. You can change this setting to any number less than 40 and it works, but that is not widely advertised.
The ability to increase the connection limit is interesting because you can now increase the max number of simultaneous connections used by IE, on your developer box.
http://www.regxplor.com/tweak11.html
Wondering how ASP.NET page caching relates to IIS6 kernel-mode caching?
- With output caching enabled for a Web Form, the page will be served directly from the Windows Server 2003 kernel if VaryByParam=“None“ and no other Vary settings are specified. The framework will not be called in any way - Application_BeginRequest will not fire.
- If VaryByParam, VaryByControl, or VaryByCustom is used, then Application_BeginRequest and Application_EndRequest will fire and the kernel cache will not be used.
- According to informal tests using ACT, the kernel cache doubles the performance of a cache hit.
Refer to http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/cpconoutputcache.asp
How are POSTs handled? These rules are independent of kernel caching.
- The request params (whether query string or POST params) are not part of the cache key. That is, unless you use VaryByParam or VaryByControl, the same cache entry will be returned no matter what the query params are.
- The cache key is the request path and the http method. That is, a postback button on the page will cause a cache miss on the first click, but not on subsequent clicks.
It is typically desirable to suppress caching on a POST. To do this, use this code in Application_BeginRequest or in Page_Load - Postbacks will be treated as cache misses.
if(HttpContext.Current.Request.HttpMethod=="POST") this.Response.Cache.SetNoServerCaching();
Alright! I've been frozen for 30 years but now thawed and ready to blog!
What is karmatron dynamics? The very science of human destiny!