
Re: DOSEMU/DOSBOX question



At 09:45 PM 1/22/08 -0500, Paul Lagasse wrote:
>If only understanding it would allow me to divine which file was
>which.
Well, my take is that it would be a fairly small undertaking to create a small DOS command which, given a long filename as its argument, would produce the DOSEMU mangled name. Do you think it would be useful to try to produce such a module?

>re

>b) with DOSEMU, scanning for a dot is from the left, not from
>the right ...

>I don't see evidence of a misattribution of the extension in a
>mangled ...
Yup, you're right. I should have looked up the C function "strrchr", but didn't. It finds the last occurrence (of "."), not the first, so the mangling does preserve the "last" extension of the original filename. Thanks for the test, and sorry for the error.
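To make the first-dot versus last-dot difference concrete, here is a minimal Python sketch. It is not the DOSEMU code: the function names are hypothetical, and Python's str.rfind and str.find merely stand in for the C functions strrchr and strchr.

```python
def split_at_last_dot(name):
    """Split at the LAST '.', the way a strrchr-based scan would."""
    dot = name.rfind(".")
    if dot == -1:
        return name, ""          # no dot at all: no extension
    return name[:dot], name[dot + 1:]


def split_at_first_dot(name):
    """Split at the FIRST '.', the behavior I had wrongly assumed."""
    dot = name.find(".")
    if dot == -1:
        return name, ""
    return name[:dot], name[dot + 1:]


long_name = "howdy de do.htm.dos.txt~"
last = split_at_last_dot(long_name)    # extension part is "txt~"
first = split_at_first_dot(long_name)  # extension part is "htm.dos.txt~"
```

With the last-dot scan, the extension candidate is "txt~", which is consistent with the mangled name ending in .TXT.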

>(BTW, in Xy I can also see the hidden Linux backup for "howdy
>etc," ie "howdy de do.htm.dos.txt~" -- which displays as
>HOWDY~3I.TXT. If I open this in Xy and edit and save it, the
>Linux file "howdy...txt~" no longer exists, and I get two new
>files (whose Linux names are howdy~3i.txt and howdy~3i.bak).
Oh, oh. I'm very sorry to hear that. That alone may make DOSEMU virtually unusable for anything other than 8.3 lower case names.
I've always wondered about a similar issue in Windows. Consider the
following, under Windows:
If I create a file called README.TXT with XY (using the NEW command), its
Windows name is README.TXT. (There is an added confusion factor in that
some programs, such as Explorer in Win NT and Win 98 at least, will display
README.TXT as Readme.txt. But that is a behavior of those programs, not the
file system itself, and I will ignore that, other than to caution that this
might introduce additional confusion into our discussion if not
understood.) Now, I can and do use File Manager (or Explorer) to rename
that to readme.txt (because I prefer lower case names). Now, I can still
open that with XY, because windows filenames are case insensitive. But,
suppose I then save it again, using XY. XY only deals in uppercase names,
so it will effectively REQUEST that the file README.TXT be saved. Yet, in
all of my experiences to date, the saved file remains "readme.txt" under
Windows. Now, why is that?
Well, firstly, there are two ways that XyWrite might work, when it goes to
save an updated existing file. It might first erase the old file, and then
create a new one, which is a perfectly logical thing to do. Or it might
open the existing file for writing, truncate it to length zero, and then
append the newly updated contents to the 0 length file. As it turns out,
there is a "create file" system call in DOS which automatically opens and
does a truncation of an existing file if one exists, or creates a new file
if one does not.
With original DOS, it didn't make a lot of difference which sequence a
program used. (It did make some difference, because in the erase,
create-new scenario, the dir entry for the file can move to a different
directory slot, and subsequently appear in a different place in an unsorted
DIR listing.) But with Windows, it makes a big difference, because the
erase will erase "readme.txt", and the subsequent create will create
"README.TXT". But the "open-existing, truncate, and append" sequence
preserves *some* existence of the original "readme.txt" file entity over
the entire process, so the name is never touched and remains "readme.txt".
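The difference between the two save sequences can be sketched with a toy Python model of a case-insensitive but case-preserving directory. This is only an illustration under stated assumptions: the class CasePreservingDir and both save functions are hypothetical names, and nothing here claims to show how Windows or XyWrite is actually implemented.

```python
class CasePreservingDir:
    """Toy model: lookups ignore case, but the stored spelling is kept."""

    def __init__(self):
        self._entries = {}  # upper-cased key -> (stored name, contents)

    def create(self, name, data=""):
        # Modeled on the DOS "create file" call: if a matching entry
        # exists it is truncated and reused, otherwise a new one is made.
        key = name.upper()
        stored = self._entries[key][0] if key in self._entries else name
        self._entries[key] = (stored, data)

    def erase(self, name):
        del self._entries[name.upper()]

    def rename(self, old, new):
        _, data = self._entries.pop(old.upper())
        self._entries[new.upper()] = (new, data)

    def names(self):
        return sorted(stored for stored, _ in self._entries.values())


def save_by_truncate(d, name, data):
    d.create(name, data)          # entry survives, so its spelling survives


def save_by_erase_create(d, name, data):
    d.erase(name)                 # old entry, and its spelling, are gone
    d.create(name, data)          # new entry takes the requested spelling


a = CasePreservingDir()
a.create("README.TXT", "v1")
a.rename("README.TXT", "readme.txt")
save_by_truncate(a, "README.TXT", "v2")
names_after_truncate = a.names()       # ['readme.txt']

b = CasePreservingDir()
b.create("README.TXT", "v1")
b.rename("README.TXT", "readme.txt")
save_by_erase_create(b, "README.TXT", "v2")
names_after_erase = b.names()          # ['README.TXT']
```

In this model the only thing that decides the final spelling is whether the directory entry continues to exist across the save.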
I've been somewhat surprised under DOS that NO editor that I have found
ends up causing "readme.txt" to become "README.TXT" when I re-save an
existing file. That *could* be true because they ALL use the create-file
call, rather than an erase, create-new sequence. Or it is also possible
that the windows designers anticipated this problem, and actually designed
the file system code so that creating a "README.TXT" immediately after
erasing "readme.txt" was a special activity recognized by the file system
code, wherein windows took it upon itself to create the new file as
"readme.txt". I do not know this to be the case, however, and I rather
doubt it.
Also, note that if an editor has "save a backup of the original when
saving" logic that is active, the logic might change to "rename the
original to .bak," then "create new." This would also cause the updated
file to again acquire the README.TXT name.
Anyway, the same kind of questions exist with DOSEMU. You've asked XY to
save HOWDY~3I.TXT. If XY does an erase first, the search for HOWDY~3I.TXT
against the mangled set of names in the directory will produce a match at
"howdy de do.htm.dos.txt~", and that file will be erased. The subsequent
creation of HOWDY~3I.TXT will then create HOWDY~3I.TXT, because there is
now no existing file that matches HOWDY~3I.TXT, even with mangling. For the
file to be saved as "howdy de do.htm.dos.txt~", the entire process has to
use the "open, truncate, append" kind of logic, which preserves the
original file's continued existence through time, even though its contents
are to be completely updated.
As it turns out, Unix (and I presume Linux) seems to have no create-file
system call, to match the DOS create-file call. But it does have a
"truncate to zero length" flag that can be set on the open-file system
call, and I think that this makes unix/linux open-file equivalent to DOS
create-file when this flag is set.
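As a rough illustration of the open-with-truncate idea, here is a small Python sketch. Python's os.open passes its flags to the underlying open call, so O_CREAT together with O_TRUNC gives "create if missing, else truncate". The helper name save_in_place is hypothetical, and this shows nothing about what DOSEMU itself does.

```python
import os
import tempfile


def save_in_place(path, data):
    """Rewrite a file via open + truncate, creating it if it is missing."""
    # O_CREAT | O_TRUNC: create if absent, otherwise cut the existing
    # file to length 0 while keeping the same directory entry.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "howdy de do.htm.dos.txt~")

save_in_place(path, b"old contents")      # first call creates the file
inode_before = os.stat(path).st_ino

save_in_place(path, b"new")               # second call truncates and rewrites
inode_after = os.stat(path).st_ino

names = os.listdir(workdir)               # the long name is still the only entry
with open(path, "rb") as f:
    contents = f.read()
```

The inode number stays the same across the second save, which is the "continued existence" of the file that keeps its original name intact.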
So, from your observation, I think that we now know that the "open,
truncate, append" logic is not occurring on the Linux side. That could be
because XY used "erase, create-new" logic for the save, or it could be
because XY used "rename-to-.bak, create-new" logic for the save because of
a "save a backup of the original" option was active in XY, or it could
mean that DOSEMU mapped the DOS create-file system call into an erase-file,
open-new sequence, since Linux doesn't have a create-file system call (and
they didn't realize that using open-file with the truncate-to-zero flag set
would be a better match).
So, some questions would be: (1) which XY were/are you using, and (2)
did/do you have any "save a copy of the original" options turned on in XY?
(I guess XY has such options, but I never have used them, and I've
forgotten.) If so, you might want to try a similar experiment with "save
original" turned off, and see if that changes the behavior.
Anyway, I think I need to look at the source a bit more and see if I can
get a better reading as to what I think the code is doing.

>re

>c) DOSEMU adds an extension of "___" (three underscores) if the
>original filename had no extension.

>I think, from what I've seen, that it only adds these
>underscores if the file is a hidden file that begins with a
>period, such as ".dosemurc".
Yup, I was wrong again. But, the actual logic, I think, adds the suffix only if the *last* period in the filename is in the first position of the filename, so I don't believe that the "___" extension will get added for a filename like ".abc." (and, boy, just thinking about this makes it clear how hard it is to really anticipate the name mangling in detail using a black box approach, without looking at the code).
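Stated as code, my current reading of that rule would look like the following Python sketch. This is an assumption drawn from my reading so far, not a transcription of the DOSEMU source, and the function name is hypothetical.

```python
def gets_underscore_extension(name):
    """True if the LAST '.' in the name is its very first character."""
    # str.rfind returns -1 when there is no '.', so names without any
    # period do not qualify under this reading of the rule.
    return name.rfind(".") == 0


dosemurc = gets_underscore_extension(".dosemurc")   # last dot at index 0
abc_dot = gets_underscore_extension(".abc.")        # last dot at index 4
plain = gets_underscore_extension("README")         # no dot at all
```

Under this reading, ".dosemurc" would get the "___" extension, while ".abc." and "README" would not.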

>re

>>8. ...So, the bottom line, as I understand it, is that any file
>>you create from DOS will therefore be 8.3, but lower case, as
>>it actually is created within Linux. Similarly, any Linux file
>>that is 8.3, but other than all lower case, will be
>>unaccessible to old DOS apps, period.

>I think that this last item is only a problem when there is a
>conflict between two files with the same name, save for
>differences in capitalization. The README.TXT I created using
>Gedit can be seen and (once I delete readme.txt and Readme.txt)
>opened in Xy.
Okay, looks like I'm wrong again, and thanks for that. I have no explanation for that, given what I understand of the code so far, so I will try to dig further into the code and understand why this is true.

>All the files I imported from WinXP that I work on in Xy were,
>then copied into Ubuntu, in uppercase. I have had no problem
>opening these files in Xy, though once they are edited and
>saved, their filenames are lowercased.
Darn. More evidence of "erase, create-new" logic in there somewhere, as I discussed above.

>(BTW, Dosbox -- in Linux -- creates its filenames in uppercase,
>and I'd bet this is its standard behavior across platforms.)

I think that is a very significant observation. Thanks for that also.
Paul, thanks very much for your observations on this stuff, and your attention to detail. With respect to continuing this discussion, please make any observations that you think are significant, but I suspect that I'm going to be hanging back for a while as I really take a look at the code and try to understand it. So if it takes me a while to respond, don't assume that the conversation is dead. (If/when a response is due from me, I'll try to check in at least once per week.)

Wally Bass