Reading and writing to a file

Obviously StreamReader won't let you write, and although there is a StreamWriter, it won't help you here.

The thing is, what you're asking turns out to be problematic. And it's not due to a limitation in the .NET framework, it's because you've asked for a problematic thing.

There are two big problems here: 1) string length, and 2) encoding.

What if the 3rd string on the 2nd line is currently "Hello", and you want to change it to "Hello, world"? You're replacing a 5 character string with a 12 character string. Bear in mind that text files are just a linear sequence of characters, one after another. If you want to replace a string with a longer string, then all the text that follows will have to be shuffled down to make space. The file systems on operating systems where .NET or other CLI implementations run (e.g. Windows, Mac OS X, Linux) don't support the ability to insert extra bytes into the middle of an existing file.

So whatever your code looks like, the end result is going to consist of writing out most of the file. (In theory, you only need to rewrite everything that follows the insertion point. In practice, this is a really bad idea. What if the power or network fails in the middle of the operation? You'll trash the file. So it's better to create a brand new modified version of the file, and then when you're done, delete the old file and rename the new one to the name of the one you just deleted. Although if you're running on Vista, you could avoid this by using its transactional file system support. But that's not easy to use from .NET today.)
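
Here is a minimal sketch of that "write a new copy, then swap it in" approach. The names SafeRewrite and ReplaceOnLine and the ".tmp" suffix are made up for illustration. It assumes the file is UTF-8 text, and it uses File.Replace to do the swap in one call instead of a separate delete and rename.

```csharp
using System;
using System.IO;

static class SafeRewrite
{
    // Hypothetical helper: replaces the first occurrence of oldText on the
    // given (zero-based) line by writing a complete new copy of the file.
    public static void ReplaceOnLine(string path, int lineIndex, string oldText, string newText)
    {
        string tempPath = path + ".tmp"; // assumed naming scheme for the new copy

        // Both default to UTF-8 here; a byte order mark in the original is not preserved.
        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            int current = 0;
            while ((line = reader.ReadLine()) != null)
            {
                if (current == lineIndex)
                {
                    int at = line.IndexOf(oldText, StringComparison.Ordinal);
                    if (at >= 0)
                    {
                        line = line.Substring(0, at) + newText + line.Substring(at + oldText.Length);
                    }
                }
                // Note: every line is written with Environment.NewLine,
                // so the original line endings are normalised.
                writer.WriteLine(line);
                current++;
            }
        }

        // Only touch the original once the new copy is complete and closed.
        File.Replace(tempPath, path, null);
    }
}
```

If something fails while the copy is being written, the original file has not been touched; at worst a stray ".tmp" file is left behind.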

Encoding is also an issue. Suppose you decide to fix the line length by padding every line with blank spaces when you first write it. That way you can change strings in an existing line without having to rewrite the whole file. This will limit the maximum number of characters that fit on a line of course, but that might be an acceptable price for the performance benefits of not having to rewrite the whole file. (This is pretty much what databases do - that's one of the reasons you have to pick a string length for textual columns.) However, there's still a problem. StreamReader deals with text in the abstract. How do you want that represented in your concrete file? There are lots of text encodings. If you choose the popular UTF-8 standard you've got a problem because not all characters are the same length in bytes. A 5 character string might only need 5 bytes. But it might need 10 or more if you use a bunch of non-ASCII characters.
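
To make that concrete, here is a small sketch showing that padding to a fixed number of characters does not give you a fixed number of bytes in UTF-8. PaddingDemo and PaddedUtf8Size are names invented for this example, and the width of 8 is arbitrary.

```csharp
using System;
using System.Text;

static class PaddingDemo
{
    // Pads a value with spaces to a fixed number of characters and
    // returns how many bytes the padded text occupies in UTF-8.
    public static int PaddedUtf8Size(string value, int widthInChars)
    {
        string padded = value.PadRight(widthInChars);
        return Encoding.UTF8.GetByteCount(padded);
    }

    static void Main()
    {
        // All three values are 5 characters long and are padded to 8 characters.
        Console.WriteLine(PaddedUtf8Size("Hello", 8));                          // 8
        Console.WriteLine(PaddedUtf8Size("h\u00e9llo", 8));                     // 9
        Console.WriteLine(PaddedUtf8Size("\u00e9\u00e9\u00e9\u00e9\u00e9", 8)); // 13
    }
}
```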

You could fix that by using UTF-16 and sticking to characters in the Basic Multilingual Plane (what used to be called UCS-2) - that's always 2 bytes per character. But the problem is StreamReader and StreamWriter abstract this away - they deliberately hide the encoding details so you don't have to deal with them. You just get to use Strings. Usually that's a good thing, but in this case, it means StreamReader/Writer don't support random access. Because some encodings are variable-length, random access sometimes simply isn't an option: the 100th character won't always be in the same place in the file - it depends on how much space was required to store the characters before it. They might take only 99 bytes, but they might take more.
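
A quick way to see the difference is to compute where a given character would start in the file under each encoding. This is only a sketch: OffsetDemo and ByteOffsetOf are hypothetical names, and the result ignores any byte order mark at the start of the file. Because a .NET char is a UTF-16 code unit, the UTF-16 offset is always twice the char index.

```csharp
using System;
using System.Text;

static class OffsetDemo
{
    // Byte offset at which the char with the given index starts,
    // i.e. the encoded size of everything that comes before it.
    public static int ByteOffsetOf(string text, int charIndex, Encoding encoding)
    {
        return encoding.GetByteCount(text.Substring(0, charIndex));
    }

    static void Main()
    {
        string ascii = "abcdef";
        string accented = "\u00e9\u00e9\u00e9def";

        // UTF-8: the offset depends on which characters come first.
        Console.WriteLine(ByteOffsetOf(ascii, 3, Encoding.UTF8));       // 3
        Console.WriteLine(ByteOffsetOf(accented, 3, Encoding.UTF8));    // 6

        // UTF-16 (Encoding.Unicode): always 2 bytes per char.
        Console.WriteLine(ByteOffsetOf(ascii, 3, Encoding.Unicode));    // 6
        Console.WriteLine(ByteOffsetOf(accented, 3, Encoding.Unicode)); // 6
    }
}
```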

Consequently, if you need random access, you need to work with bytes, not text. The FileStream class lets you do this. But now it's your problem to convert between streams of bytes and strings. (Something you were always going to have to do - you are asking to be able to manage byte offsets for strings in files.) So you'll also need to look up the Encoding family of classes in System.Text (Encoding, Encoder and Decoder).
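
As a rough sketch of what that looks like, the following stores fixed-width fields as raw UTF-16 and overwrites one in place with FileStream. Everything here is an assumption made for illustration: the FixedWidthFile class, the 16 character field width, and a file layout with no byte order mark and no line breaks.

```csharp
using System;
using System.IO;
using System.Text;

static class FixedWidthFile
{
    const int FieldChars = 16;             // assumed field width, in UTF-16 code units
    const int FieldBytes = FieldChars * 2; // UTF-16 uses 2 bytes per code unit

    // Overwrites one field in place; nothing else in the file moves.
    public static void WriteField(string path, int fieldIndex, string value)
    {
        if (value.Length > FieldChars)
        {
            throw new ArgumentException("Value does not fit in the field.", "value");
        }

        byte[] bytes = Encoding.Unicode.GetBytes(value.PadRight(FieldChars));
        using (var stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
        {
            stream.Seek((long)fieldIndex * FieldBytes, SeekOrigin.Begin);
            stream.Write(bytes, 0, bytes.Length);
        }
    }

    // Reads one field back and strips the padding spaces.
    public static string ReadField(string path, int fieldIndex)
    {
        byte[] bytes = new byte[FieldBytes];
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            stream.Seek((long)fieldIndex * FieldBytes, SeekOrigin.Begin);
            int read = 0;
            while (read < bytes.Length)
            {
                int n = stream.Read(bytes, read, bytes.Length - read);
                if (n == 0) throw new EndOfStreamException();
                read += n;
            }
        }
        // Trailing spaces that were part of the value are lost here too.
        return Encoding.Unicode.GetString(bytes).TrimEnd(' ');
    }
}
```

With this layout, changing "Hello" to "Hello, world" in field 1 is a single 32 byte write at offset 32, at the cost of a hard limit on field length.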

What are you trying to do here? It might be that your whole strategy isn't the easiest way to do what you're doing. It sort of sounds like you're building your own mini-database, and that's always harder than people generally imagine.

