| Re: Building & using U++ without TheIDE [message #11462 is a reply to message #11460] |
Fri, 14 September 2007 02:21   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
| luzr wrote on Fri, 14 September 2007 00:58 |
| Quote: |
3) I don't understand how unicode is implemented. There is String, AString, WString, but there is no TString, or whatever the name, like there is TCHAR that expands into char or wchar_t, depending on whether UNICODE/_UNICODE is defined. How do I define whether I'm in unicode or not?
|
This is sort of irrelevant. There is no UNICODE mode. All the time you have 8-bit and 16-bit String/WString.
Recommended approach is to use UTF-8 encoding. In that case, both strings can contain unicode and there is simple conversion between them. (You can however use one of 15 WIN/ISO encodings as 8-bit default.)
| Quote: |
I mean, MessageBox will expand into MessageBoxA or MessageBoxW?
|
Always into MessageBoxA. However, in U++ you rather use Prompt, which can work with UTF-8.
| Quote: |
And why path handling routines use char - can I handle unicode filenames with U++?
|
Unfortunately, there is a drawback caused by the fact that we still have to support Win98, so we cannot use the W variants. In practice, this is really minor trouble, but nothing to be happy about.
Anyway, when you are using only functions from U++, there is automatic conversion between U++ default encoding and 8-bit encoding of Windows.
Mirek
|
That would be one pretty serious drawback IMHO. I'm using Russian and Hebrew, and I've used so many apps with bad unicode support that I really want to avoid making another one. Who uses Win98 nowadays (and there's a unicode layer for those)?
I've tried entering non-English chars in TheIDE source editor, it didn't let me. So I made a UTF-8 file containing a name of a file (with both Hebrew and Russian characters). U++ was able to read the name but unable to load the file. I guess this is the reason:
handle = CreateFile(ToSystemCharset(name),
Since this is CreateFileA, it can't open files with unicode filenames.
I tried adding #define UNICODE and #define _UNICODE. After commenting out some stuff (cAlternateFileName didn't exist) the program built but didn't work. CreateFileW can't accept UTF-8, it needs UTF-16.
Could this be fixed by some creative editing of Util.h? And what about window title, and controls? I mean, any chance to bring this to the TCHAR style, so the program can be compiled in ANSI, and in Unicode (supporting several languages simultaneously)? This should be a matter of some defines and a function to convert UTF8 to Win32 unicode. Or are there other parts of U++ that rely on a specific String/WString encoding?
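For reference, the conversion CreateFileW needs is small. The following is a minimal, portable sketch (not U++ code; the name Utf8ToUtf16 is made up for illustration) that turns a UTF-8 std::string into UTF-16 code units and replaces malformed bytes with U+FFFD. On Windows, where wchar_t is 16 bits, the result could be handed to a W function after a cast; MultiByteToWideChar with CP_UTF8 does the same job natively.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: UTF-8 -> UTF-16 code units.
// Malformed or truncated input bytes become U+FFFD, one per bad byte.
std::u16string Utf8ToUtf16(const std::string& utf8) {
    std::u16string out;
    const std::size_t n = utf8.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra = 0;     // continuation bytes expected
        std::uint32_t cp = lead;   // code point being assembled
        std::uint32_t min_cp = 0;  // smallest legal value (rejects overlong forms)
        bool ok = true;
        if (lead < 0x80)                { extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else                            { ok = false; }
        if (i + extra >= n) ok = false;  // sequence runs past the end of input
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (ok && (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;
        if (!ok) {
            out.push_back(static_cast<char16_t>(0xFFFD));
            ++i;
            continue;
        }
        if (cp >= 0x10000) {
            // Outside the basic plane: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += extra + 1;
    }
    return out;
}
```

Hebrew and Russian letters all sit in the basic plane, so each of them becomes a single 16-bit unit here.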
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11488 is a reply to message #11360] |
Sat, 15 September 2007 18:57   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
I made some progress.
Tried to implement full Unicode (TCHAR/String/WString handling), but eventually TheIDE/debugger went crazy (it stopped several lines after the breakpoint, and for the same #ifdef flagUNICODE it took the #ifdef branch in one place in the file and the #else branch in another). So I dropped the idea for a while and returned to static lib building.
I made a mini-parser for *.upp files. Used it to generate a list of used files, deleted them, and ended up with the unused/forgotten files. I'm attaching a zip of presumably forgotten files, maybe they should be removed.
I also found several referenced (used) yet nonexistent packages:
TCore (Geom/Coords)
TCtrlLib (Geom/Ctrl)
TCtrlLib/Calc (Ole/Ctrl/Calc)
TDraw (Geom/Draw)
VectorDes (ide/VectorDes - it actually references itself, and by short path)
Additionally, directory separators aren't consistent - in some files '/' is used, in others '\'.
I also noticed that in plugin/png, most files are "pseudo-forgotten" - they don't appear in *.upp, yet they are included in pnglib.c. Does this happen in any other package?
Are there any plans to clean up the source? These inconsistencies are somewhat preventing me from completing the project generator.
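To show what the mini-parser mentioned above has to do, here is a rough sketch. It is not the attached tool, and it assumes a simplified view of the .upp format: statements end with ';', a statement starts with a keyword such as uses or file, entries are separated by commas, and anything after the first word of an entry is an option that can be ignored. Conditional forms such as uses(...) are not handled. The function names are made up. Backslashes are normalized to '/' because of the separator inconsistency noted above.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Split `text` on `sep`, ignoring separators inside double quotes.
static std::vector<std::string> SplitOutsideQuotes(const std::string& text, char sep) {
    std::vector<std::string> parts;
    std::string cur;
    bool quoted = false;
    for (char c : text) {
        if (c == '"') quoted = !quoted;
        if (c == sep && !quoted) { parts.push_back(cur); cur.clear(); }
        else cur.push_back(c);
    }
    parts.push_back(cur);
    return parts;
}

// First token of `s`: a quoted string (quotes removed) or a run of
// non-space characters. `rest` receives the index just past the token.
static std::string FirstToken(const std::string& s, std::size_t* rest = nullptr) {
    std::size_t i = 0;
    while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
    std::string tok;
    if (i < s.size() && s[i] == '"') {
        ++i;
        while (i < s.size() && s[i] != '"') tok.push_back(s[i++]);
        if (i < s.size()) ++i;  // skip closing quote
    } else {
        while (i < s.size() && !std::isspace(static_cast<unsigned char>(s[i])))
            tok.push_back(s[i++]);
    }
    if (rest) *rest = i;
    return tok;
}

// Collect the entries of the "uses" and "file" statements, keyed by keyword.
std::map<std::string, std::vector<std::string>> ParseUpp(const std::string& text) {
    std::map<std::string, std::vector<std::string>> result;
    for (const std::string& stmt : SplitOutsideQuotes(text, ';')) {
        std::size_t rest = 0;
        const std::string keyword = FirstToken(stmt, &rest);
        if (keyword != "uses" && keyword != "file") continue;
        for (const std::string& item : SplitOutsideQuotes(stmt.substr(rest), ',')) {
            std::string name = FirstToken(item);
            std::replace(name.begin(), name.end(), '\\', '/');  // unify separators
            if (!name.empty()) result[keyword].push_back(name);
        }
    }
    return result;
}
```

Comparing the collected file entries against a directory listing gives the "forgotten" files; checking each uses entry against the existing package folders gives the broken references.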
-
Attachment: Unused.zip
(Size: 487.30KB, Downloaded 450 times)
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11489 is a reply to message #11462] |
Sat, 15 September 2007 20:11   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Thu, 13 September 2007 20:21 |
That would be one pretty serious drawback IMHO. I'm using Russian and Hebrew, and I've used so many apps with bad unicode support that I really want to avoid making another one. Who uses Win98 nowadays (and there's a unicode layer for those)?
|
Well, I agree. Believe me, I would be more than happy to be finally able to cancel Win98 support and make everything UNICODE.
But even a year ago there were people asking for *TheIDE* to run on 98.... I can understand that people want apps to be Win98 compatible, but developing on it?...
| Quote: |
I've tried entering non-English chars in TheIDE source editor, it didn't let me.
|
Actually, this is different reason. See FAQ. (In short, theide keeps track of encoding of file used and if keystroke does not match encoding, it is not inserted-> switch file encoding to UTF-8.)
| Quote: |
Could this be fixed by some creative editing of Util.h?
|
Yes. In fact, the code is there for PocketPC version.
| Quote: |
And what about window title, and controls?
|
Window title already works both in Win98 and XP UNICODE (it is quite easy to provide both ways).
All widgets are UNICODE capable, there really is no trouble.
| Quote: |
I mean, any chance to bring this to the TCHAR style, so the program can be compiled in ANSI, and in Unicode (supporting several languages simultaneously)?
|
No. Forget about TCHAR style. But it can be easily done just right:)
Which makes me think, this can even be done right while still supporting Win98, it is just much more work (need to test the platform and dynamically link the A or W version). .dli would help.
| Quote: |
This should be a matter of some defines and a function to convert UTF8 to Win32 unicode.
|
Well, in fact, these functions are already there and they work. In fact, U++ has no problem handling Chinese filenames (because Chinese Windows uses a multibyte 8-bit encoding of characters, something similar to UTF-8, and it is trivial to convert this to UTF-8 and back...).
| Quote: |
Or are there other parts of U++ that rely on a specific String/WString encoding?
|
Well, once you have String<->WString conversion and default encoding (SetDefaultCharset), the rest of the code is pretty much encoding-agnostic. The only real problem is the one you have encountered - the outside world interface. This is basically filenames and fonts; fonts are working UNICODE in Win98 out of the box, so really the remaining problem is filenames.
But once again, current pragmatic solution is just a result of fact that 2 years ago we had customers that required Win98 support...
Mirek
PS.: BTW, Hebrew... Isn't it RTL? (Another not yet resolved problem, this time because we just do not know how RTL is supposed to work...).
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11492 is a reply to message #11489] |
Sat, 15 September 2007 21:22   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
Have you tried using MS Unicode Layer for Win9x? I never did, but from the description it doesn't look much complicated. That way all builds could be Unicode.
Actually TCHAR is supposed to work. I started to do it, didn't have the chance to complete (see previous post), but in theory my solution should just work.
I assumed that String always contains UTF-8, and WString always contains UTF-16. I also assumed that String.ToWString and WString.ToString convert between them properly. And I used Win32 functions - MultiByteToWideChar and WideCharToMultiByte (have no idea how stuff like that works on Linux).
TSTR is a class I made, that would be an interface between String/WString and Win32 functions taking strings. Example:
String s = "text"; WString ws = "cap";
MessageBox(NULL, TSTR(s), TSTR(ws), MB_OK);
If UNICODE is defined, MessageBox would be MessageBoxW, and TSTR would transform String and WString to WCHAR*. Otherwise MessageBox would be MessageBoxA, and TSTR would transform both to char* (ANSI charset, removing unknown characters). This should compile just fine on Win9x without UNICODE defined, and on WinXP with/without UNICODE defined.
In the file there's also UTFBOM, since I couldn't type in TheIDE (thanks for telling how) I wrote some file handling.
Usage (read any encoding with BOM into UTF-8 string without BOM):
String ftxt = LoadFile("File.txt");
String sf; WString wsf;
int res = UTFBOM::ReadBOM(ftxt, &sf, &wsf);
String strUTF8;
if (res < 2) strUTF8 = UTFBOM::WriteBOM(sf, 1, false);
else strUTF8 = UTFBOM::WriteBOM(wsf, 1, false);
There are probably bugs in the files (couldn't test thoroughly), yet I believe the concept is correct.
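For readers without the attachment, BOM detection itself can be sketched like this. This is not the UTFBOM class from TStr.h; the names DetectBom and StripBom and the enum are made up for illustration. Only the byte patterns (EF BB BF, FF FE, FE FF and their UTF-32 counterparts) are standard.

```cpp
#include <cstddef>
#include <string>

enum class Bom { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

struct BomInfo {
    Bom kind;            // which byte order mark was found
    std::size_t length;  // number of bytes the mark occupies
};

// Inspect the first bytes of a raw file image and report its BOM, if any.
BomInfo DetectBom(const std::string& data) {
    // Byte at index i, or -1 when the data is shorter than that.
    auto at = [&data](std::size_t i) -> int {
        return i < data.size() ? static_cast<unsigned char>(data[i]) : -1;
    };
    if (at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF) return {Bom::Utf8, 3};
    // UTF-32 LE must be tested before UTF-16 LE: both start with FF FE.
    if (at(0) == 0xFF && at(1) == 0xFE && at(2) == 0x00 && at(3) == 0x00) return {Bom::Utf32LE, 4};
    if (at(0) == 0x00 && at(1) == 0x00 && at(2) == 0xFE && at(3) == 0xFF) return {Bom::Utf32BE, 4};
    if (at(0) == 0xFF && at(1) == 0xFE) return {Bom::Utf16LE, 2};
    if (at(0) == 0xFE && at(1) == 0xFF) return {Bom::Utf16BE, 2};
    return {Bom::None, 0};
}

// Return the data with any leading BOM removed.
std::string StripBom(const std::string& data) {
    return data.substr(DetectBom(data).length);
}
```

One known ambiguity: a UTF-16 LE file whose first character is U+0000 looks like UTF-32 LE to this check.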
Hebrew is indeed RTL. RTL by itself isn't complicated. Mirror your screen horizontally - that's pretty much what you get. Fully RTL indeed flips icons, minimize|maximize|close becomes close|maximize|minimize on the left side of the screen. Cursor advances to the left when typing. Home key brings cursor to right, End to left (in LTR Home goes left and End goes right). Backspace deletes symbol to the right, Delete deletes symbol to the left. Thinking of it now, if you mirror an LTR screen, the only thing that will be different from an RTL screen is the left/right keys. That is, left remains left and right remains right. Only that in RTL right actually goes towards beginning of text, not end.
Now, that was the correct behaviour, at least what I'm used to (MS Word is quite good in Hebrew). Problems arise when combining RTL with LTR, especially if you also insert "neutral" symbols like !,.?
Then most text/word processors go crazy and results are rather unpleasant. MS Word is good, yet it isn't perfect.
(LTR TEXT)|(RTL TEXT)
| is the cursor. Press right - you'll probably get the cursor about here:
(LTR TEXT)(RTL TEX|T)
Because of the direction change. That also depends on paragraph orientation - whether the paragraph is RTL or LTR makes a difference. The behavior of Left/Right/Home/End/etc. keys is usually like that only in RTL paragraphs, and would be weird in LTR paragraph just like English would type weird in RTL paragraph (try it - right Ctrl+Shift in a textbox). Not exactly intuitive, and I'm not sure if that's standards-conforming behavior, but that often happens.
Does U++ work right with RTL-only (single language)? E.g. correct Left/Right/Home/End/Backspace/Delete behavior? That shouldn't be so difficult to implement.
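The RTL-only key behavior described above can be written down as a tiny sketch. It assumes a single-direction paragraph, and a cursor stored as a logical offset (0 = before the first typed character, len = after the last). The names are made up, and mixed LTR/RTL text is deliberately out of scope.

```cpp
#include <algorithm>

enum class Dir { LTR, RTL };
enum class Key { Left, Right, Home, End };

// New logical cursor position after a navigation key.
// Home/End are logical in both directions; only their place on screen differs.
int MoveCursor(int pos, int len, Key key, Dir dir) {
    switch (key) {
    case Key::Home:
        return 0;    // start of text: left edge in LTR, right edge in RTL
    case Key::End:
        return len;  // end of text: right edge in LTR, left edge in RTL
    case Key::Left:
    case Key::Right: {
        // Right moves toward the end in LTR; Left moves toward the end in RTL.
        const bool toward_end = (key == Key::Right) == (dir == Dir::LTR);
        const int next = pos + (toward_end ? 1 : -1);
        return std::max(0, std::min(len, next));
    }
    }
    return pos;
}
```

Backspace and Delete need no direction logic in this model: Backspace always removes the character before the logical position, which simply happens to be drawn on the right in RTL.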
P.S. please tell me what can be done about source packages. I don't know whether these are indeed mistakes, or I'm again doing something wrong.
-
Attachment: TStr.h
(Size: 6.13KB, Downloaded 596 times)
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11493 is a reply to message #11492] |
Sat, 15 September 2007 22:00   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Sat, 15 September 2007 15:22 | Have you tried using MS Unicode Layer for Win9x? I never did, but from the description it doesn't look much complicated. That way all builds could be Unicode.
|
Well, but that is not out-of-box solution...
| Quote: |
Actually TCHAR is supposed to work. I started to do it, didn't have the chance to complete (see previous post), but in theory my solution should just work.
I assumed that String always contains UTF-8, and WString always contains UTF-16. I also assumed that String.ToWString and WString.ToString convert between them properly. And I used Win32 functions - MultiByteToWideChar and WideCharToMultiByte (have no idea how stuff like that works on Linux).
TSTR is a class I made, that would be an interface between String/WString and Win32 functions taking strings. Example:
|
Actually, there is already an existing TSTR there, it is called FromSystemCharset and ToSystemCharset (you need two ways of conversion). See in util.cpp how it is defined for PocketPC - in that case, ToSystemCharset is the same thing as your TSTR...
Anyway, I would still rather use dynamic loading.
| Quote: |
In the file there's also UTFBOM, since I couldn't type in TheIDE (thanks for telling how) I wrote some file handling.
Usage (read any encoding with BOM into UTF-8 string without BOM):
|
What is BOM?
| Quote: |
Now, that was the correct behaviour, at least what I'm used to (MS Word is quite good in Hebrew). Problems arise when combining RTL with LTR, especially if you also insert "neutral" symbols like !,.?
|
Yes, that is exactly the trouble I am afraid of...
Mirek
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11496 is a reply to message #11495] |
Sat, 15 September 2007 22:48   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
Ah, thanks, I did not know that there is one for UTF-8...
So, if I understand you well, EF BB BF at the start of UTF-8 sequence should be ignored, right?
Anyway, this rather looks like file issue... Not sure it should be part of basic UTF-8 code?
| Quote: |
Another problem is that ToSystemCharset always returns String, which is castable to char*, and TSTR is castable to _TCHAR - which would always be what Win32 functions expect.
|
Ah, I see. You just did not understand me. Look at the PocketPC version - that one returns WString instead of String. The result of these always goes directly to system calls (or, for FromSystemCharset, directly uses the value returned by a system call), therefore this is possible.
IMO, all you really need to do to try playing with UNICODE/TCHAR is to change the #ifdefs to use the PocketPC (WinCE) version for UNICODE too.
| Quote: |
Is UTF-8 correctly converted in String <-> WString conversions?
|
I hope so, for basic plane and except the BOM. There is one extension though: Wrong UTF-8 sequences are interpreted byte by byte as characters 0xEExx (where xx is the byte).
The purpose is simple: This way, you can convert ANY input data to WString and back without losing any information. This proved an absolute necessity if you have an editor capable of handling multiple encodings in a single file.
(That is why we named this UTF-8EE, as "Error Escape" or "placing to 0xEExx")
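A small sketch of the error-escape idea, written in plain C++ rather than taken from U++ (the function names are made up). It covers the basic plane only, matching the limitation mentioned in this thread, so 4-byte sequences are escaped too. One detail is an assumption of this sketch: valid encodings of U+EE00..U+EEFF are also escaped byte by byte, since otherwise they could not be told apart from escaped bytes on the way back.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace {
bool IsCont(unsigned char b) { return (b & 0xC0) == 0x80; }
bool IsEscape(std::uint32_t cp) { return cp >= 0xEE00 && cp <= 0xEEFF; }
}  // namespace

// UTF-8 -> 16-bit units; every byte that is not part of an accepted
// 1-3 byte sequence becomes 0xEE00 | byte.
std::u16string DecodeUtf8EE(const std::string& in) {
    std::u16string out;
    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        const unsigned char b1 = i + 1 < n ? static_cast<unsigned char>(in[i + 1]) : 0;
        const unsigned char b2 = i + 2 < n ? static_cast<unsigned char>(in[i + 2]) : 0;
        std::uint32_t cp = 0;
        std::size_t len = 0;  // 0 means "no valid sequence starts here"
        if (b0 < 0x80) {
            cp = b0; len = 1;
        } else if (b0 >= 0xC2 && b0 <= 0xDF && IsCont(b1)) {
            cp = ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu); len = 2;
        } else if ((b0 & 0xF0) == 0xE0 && IsCont(b1) && IsCont(b2)) {
            cp = ((b0 & 0x0Fu) << 12) | ((b1 & 0x3Fu) << 6) | (b2 & 0x3Fu);
            const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
            // Reject overlong forms, surrogates, and the escape range itself.
            if (cp >= 0x800 && !surrogate && !IsEscape(cp)) len = 3;
        }
        if (len == 0) {
            out.push_back(static_cast<char16_t>(0xEE00u | b0));  // error escape
            i += 1;
        } else {
            out.push_back(static_cast<char16_t>(cp));
            i += len;
        }
    }
    return out;
}

// 16-bit units -> bytes; units in 0xEE00..0xEEFF turn back into the raw byte.
std::string EncodeUtf8EE(const std::u16string& in) {
    std::string out;
    for (char16_t c : in) {
        const std::uint32_t cp = c;
        if (IsEscape(cp)) {
            out.push_back(static_cast<char>(cp & 0xFF));
        } else if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With these rules, decoding any byte string and encoding it again gives back the same bytes, which is the property described above.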
| Quote: |
And are other plane symbols also supported (symbols that take 4 bytes in UTF-16)?
|
No, unfortunately, other planes are not implemented yet.
Mirek
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11502 is a reply to message #11496] |
Sun, 16 September 2007 01:43   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
Yes, UTF-8 can have BOM. Actually it has, and all programs I've used write/recognize it, with the unfortunate exception of GCC (it takes UTF-8 without BOM - such files aren't always shown correctly in text editors).
There is no need to modify existing UTF-8 handling since I doubt BOM is used in strings (it's essential for files). Yet my UTFBOM might be useful since LoadFile/SaveFile aren't encoding-aware, and files that are saved in UTF-16 format (that's what I commonly use for non-English text) would be loaded as ANSI/UTF-8. Or maybe just add autodecode to LoadFile and optional encoding params to SaveFile.
Multiple encodings in one file? Any examples? I don't think any text editor would recognize such a file.
I tried your suggestion about WinCE. Is PocketPC Unicode-only (that would be awesome if MS actually made such good decision)? First of all, cAlternateFileName, from Path.h, is never defined. As such, GetMSDOSName can't be defined either, and can't be used in FileSystemInfo::Find. That's likely a bug - I just commented everything out, or is there a way to implement GetMSDOSName?
Well, it didn't really work. The same "craziness" returned. I replaced:
#ifdef PLATFORM_WINCE
with:
#if defined(PLATFORM_WINCE) || defined(UNICODE)
and define UNICODE in main.cpp:
#define UNICODE
#include <Core/Core.h>
using namespace Upp;
...
Still, in the debugger it insisted on going into the #else. Rebuild all didn't help. My guess is that the solution could've worked (WString should cast into WCHAR*).
BTW, I've found some mistakes in UTFBOM and fixed that. Now I seem to be able to load/save UTF-8/UTF-16 LE/BE with/without BOM. I'm attaching the updated code (class + demo that I tried to use to read unicode-filenamed file). UTF-32 is kinda rare, though I might add it too for sake of completeness. However, that would require a String/WString that can work with embedded nulls - can they?
As for RTL, if I have time, I'll read the Unicode specs on how LTR and RTL are mixed in the same paragraph. The issue is quite interesting, but I'm afraid a standards-conforming solution might end up being an unusual one since many programs tend to ignore the existence of RTL languages.
P.S. is there a portable way to get a key from console (only key, without Enter)? Like _getch()?
P.S.2 String (UTF-8) seems to be the native U++ representation of text. Is WString used / recommended to be used for anything besides Unicode OS API calls?
-
Attachment: UniTest.cpp
(Size: 3.78KB, Downloaded 661 times)
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11548 is a reply to message #11360] |
Mon, 17 September 2007 11:07   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
It's UTF-16. And the problem is solved by ignoring it - non-main plane characters aren't supported.
My attempts at creating a static lib have slowly become attempts to simulate BLITZ outside TheIDE. Overall, they ended up successful. This isn't BLITZ yet, but build times are better than with a precompiled header. EXE size is bigger, though (the way I implemented this, packages are either not included, or included as a whole - so even a basic GUI app would have all widgets and lots of other packages compiled and linked).
Animated Hello example (full rebuild):
Code::Blocks svn4421 / MinGW 3.4.5 (Debug) : 1:18 / 241 warnings / 11.7MB.
Code::Blocks svn4421 / MinGW 3.4.5 (Release): 2:59 / 252 warnings / 3.3MB.
TheIDE 708dev2b / MinGW 3.4.2 (Debug): 1:11 / 0 warnings / 13.2MB.
TheIDE 708dev2b / MinGW 3.4.2 (Optimal): 2:39 / 1 warning / 1.6MB.
Modifying the source to work as SCU on a per-package basis isn't so difficult. There were several "cosmetic" changes - adding underscores / commenting to prevent "redefined" errors (there are very few conflicting symbol names in U++, yet they exist). But there was one large change - instead of the included 1.1.4 / 1.2.2 zlib, I had to use zlib 1.2.3 and seriously modify it. For one, in SCU all files are compiled as C++, and zlib is K&R C - modifying all function declarations was necessary. Then, there were many conflicts between different c files, more overall than in the whole of U++ without zlib. I couldn't resolve the conflicts in the bundled zlib version, but I managed to in zlib 1.2.3 (underscores, include guards, etc.).
I could upload modified files if anyone is interested in trying it out / modifying main source. Would be nice if main U++ source could include at least the minor changes (without zlib), that way only replacing zlib would be necessary.
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11553 is a reply to message #11552] |
Mon, 17 September 2007 15:46   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Mon, 17 September 2007 07:50 | I successfully compiled Animated Hello example, and it ran fine (jumping colorchanging letters). I'll try other examples later, though I'm afraid they might not work due to the way TheIDE treats .lay and .iml files (other IDEs probably won't know what to do with these).
|
Should not be a problem, these files are just "compiled" using the preprocessor. TheIDE has editors for them, but during the build process, they are ignored (they do not have .cpp nor .c extension after all).
| Quote: |
Maintenance would probably consist of watching that .upp files are correct (no broken references), and that there are no conflicting names in the whole U++. The latter is currently easy but might become more difficult in the future. That's because U++ has lots of functions and enums, that are members of the namespace yet aren't members of any class. That seriously increases the chance that 2 unrelated cpp files will have a function with the same name - erroneous when compiling as SCU.
|
a) This is unlikely. Also, just the same name is not enough, only same signature is the problem.
b) I do not quite understand this SCU issue. I thought that task is to make possible to use U++ with CodeBlocks and VisualC++. I believe that users rather expect .lib files?
| Quote: |
Please tell me if it's fine to update zlib to a newer yet modified version. Plus, it should get tested (any simple U++ programs that could be used to test if zlib works correctly?). I'll try to make smallest-possible changes list to the main sources.
|
Yes, it is OK. Alternatively, please put updated package here so that we can pick it up for the next dev release.
Mirek
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11562 is a reply to message #11553] |
Mon, 17 September 2007 18:33   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
| luzr wrote on Mon, 17 September 2007 15:46 |
a) This is unlikely. Also, just the same name is not enough, only same signature is the problem.
b) I do not quite understand this SCU issue. I thought that task is to make possible to use U++ with CodeBlocks and VisualC++. I believe that users rather expect .lib files?
Mirek
|
b) I started with the goal of .lib, yet now I find it rather unattractive due to the 410MB debug lib. With a lib there are 2 possibilities - redistribute it, or redistribute a project to easily build it. 410MB makes the former impossible, and the latter would require maintaining projects for different compilers/IDEs. + even with precompiled headers build time is about 10 mins vs ~1 min in TheIDE.
So what I did now, instead, is use SCU approach to drastically reduce compilation time (it's not much worse than BLITZ's now), partially at the cost of EXE size non-modularity (simple GUI and complex GUI apps will have the same big EXE since the whole CtrlLib package is linked).
I have an interface header for each package, implementing SCU, so when I do: #include <Upp/CtrlLib.h> it's like adding CtrlLib package in TheIDE. These headers are auto-generated from U++ source (using .upp files). So basically, user can work in Code::Blocks, without a static lib (saving space and better debugging), yet with similar fast compiles. Environment becomes quite similar to TheIDE, though there are drawbacks - no embedded help / .lay and .iml editors, and larger EXE size.
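To make the generated-header idea concrete, here is a rough sketch of the generation step. It is not my actual generator: the function name, the comment line it emits, and the <Package/File> include form are assumptions for illustration. Given a package name and its file list (for example as read from the .upp file), it emits one translation unit that pulls in every .cpp and .c file of the package.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build the text of a single compilation unit (SCU) for one package.
// Headers and other non-source entries in the list are skipped.
std::string MakeScuSource(const std::string& package,
                          const std::vector<std::string>& files) {
    std::string out = "// Generated single compilation unit for package " + package + "\n";
    for (const std::string& file : files) {
        const std::size_t dot = file.rfind('.');
        const std::string ext = dot == std::string::npos ? "" : file.substr(dot);
        if (ext == ".cpp" || ext == ".c")
            out += "#include <" + package + "/" + file + ">\n";
    }
    return out;
}
```

Because every source of the package ends up in one translation unit, file-local names (macros, static functions, enums) from different files now share a scope, which is where the conflicts come from.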
I believe these drawbacks aren't too big, and I don't rule out the possibility of building (based on PCH, not SCU) and redistributing a release-only lib to reduce final EXE size.
a) Unlikely? Yes. Yet possible and it happens. IsLeapYear is once a function, another time a macro. BINS is defined in heap and in draw palette, with different values. INITBLOCK/EXITBLOCK also cause conflicts since some happen to appear on the same line number. z.cpp (zlib) is for some reason in Core, and it redefines ASCII_FLAG, HEAD_CRC etc. already defined in plugin/z. RichText/Para.h has Code enum, yet Code is #defined in plugin/z/lib/deflate.h. These are the minor changes in main U++ source I was talking about. That, zlib, and a few casts in png to make it C++ compatible.
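The INITBLOCK collision is the kind you get when a macro builds its helper names from the line number: two blocks on the same line of different files are fine separately but clash once both files are in one translation unit. A sketch of one way around it follows. This is not the U++ macro; all SKETCH_ names are made up, and __COUNTER__ is a compiler extension (GCC 4.3 and later, Clang, MSVC), so older toolchains would need another source of unique names.

```cpp
#include <string>
#include <vector>

#define SKETCH_CAT_(a, b) a##b
#define SKETCH_CAT(a, b) SKETCH_CAT_(a, b)

// Runs a function during static initialization.
struct SketchInitCall {
    explicit SketchInitCall(void (*fn)()) { fn(); }
};

// `id` is expanded once before substitution, so all three uses get the same
// number. Taking it from __COUNTER__ instead of __LINE__ keeps names unique
// even when many files are merged into one translation unit.
#define SKETCH_INITBLOCK_(id)                                          \
    static void SKETCH_CAT(sketch_init_fn_, id)();                     \
    static SketchInitCall SKETCH_CAT(sketch_init_obj_, id)(            \
        SKETCH_CAT(sketch_init_fn_, id));                              \
    static void SKETCH_CAT(sketch_init_fn_, id)()
#define SKETCH_INITBLOCK SKETCH_INITBLOCK_(__COUNTER__)

// Function-local static, so it is safe to use from initializers.
std::vector<std::string>& InitLog() {
    static std::vector<std::string> log;
    return log;
}

SKETCH_INITBLOCK { InitLog().push_back("first"); }
SKETCH_INITBLOCK { InitLog().push_back("second"); }
```

Both blocks run before main, in the order they appear in the translation unit.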
P.S. tried to compile on MSVC8. This piece from deflate.h:
#define Freq fc.freq
#define Code fc.code
#define Dad dl.dad
#define Len dl.len
Doesn't let me compile since Code and Len are names of parameters of functions in winnt.h. => RichText/Para.h is fine, zlib should be further modified.
I think I'll just prefix everything troublesome in zlib with "zlib_". Should work. Any demo projects to test zlib? I want to make sure that the library still works after my modifications.
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11571 is a reply to message #11569] |
Mon, 17 September 2007 23:47   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
| luzr wrote on Mon, 17 September 2007 22:28 |
| sergei wrote on Mon, 17 September 2007 12:33 |
redistributing a release-only lib to reduce final EXE size.
|
I would rather follow this approach.
Or perhaps even better, you can make debug version - with runtime checks but without the debug info.
BTW, as I think you know, you can run TheIDE in commandline mode too, so automated generation tool does not rule out use TheIDE as build tool. I am even willing to add functions to generate .libs directly - but I do not really know how to group all the stuff to libs.
Mirek
P.S.: Looking at all the troubles you have with this reminds me why we created theide. For the first 2 years, U++, named SQL++ back then, was developed with Visual Studio 6.0...
|
Well, I want to keep SCU approach for debug. Debug EXEs are about 13MB - not too much. The heavy point in favor of this system is being able to step in U++ source during debug. + You could also easily modify U++ source while developing/debugging. So IMHO debug lib isn't necessary.
Release lib is also not necessary. It would be just an improvement for EXE size (I could create a lib with SCU approach, but it wouldn't make any difference - including a package would link all of it).
The problem with precompiled header vs SCU is the need to create project files. With SCU you can redistribute the source, user just includes necessary packages. With precompiled header you have to setup project files with all cpps + precompiled header to build a lib. Which isn't trivial, since not all cpps actually should be compiled. So, SCU gets a point for maintainability. Precompiled header is worth it only in release - smaller EXE. And only for programs that don't utilize most of the code in packages they use.
More interesting solution could be splitting packages, like instead of one huge CtrlLib, several with different kinds of widgets. But that's probably just too much work...
P.S. devpacks are released every 2-3 weeks, right? Can I get the current source tree somewhere?
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11574 is a reply to message #11493] |
Tue, 18 September 2007 02:35   |
 |
tvanriper
Messages: 85 Registered: September 2007 Location: Germantown, MD, USA
|
Member |
|
|
| Quote: |
sergei wrote on Sat, 15 September 2007 15:22
Have you tried using MS Unicode Layer for Win9x? I never did, but from the description it doesn't look much complicated. That way all builds could be Unicode.
luzr wrote on Sat, 15 September 2007 16:00
Well, but that is not out-of-box solution...
|
I have had quite a bit of experience with using UNICOWS, and can at least comment on its use, and perhaps offer an alternative idea.
It is, and it isn't an out-of-box solution. To use it, you have to link to a .lib file provided by Microsoft in a special way (it has to be the first library linked, then subsequent libraries can link... otherwise the system won't work). And you have to distribute a 'unicows.dll' (if I remember the name properly) with the application on Win9x-derived OSes (but you don't have to distribute it on non-Win9x-derived OSes). Otherwise, the application will call the Unicode versions of the Win32 API calls, and Mysteriously Bad Things Will Happen.
So, yeah, you could distribute applications with it 'out-of-the-box' in many cases, but for Win9x, you'd have to drop in that DLL. And... well I don't know Microsoft's position on the distribution of this library (or the .lib file, for that matter). I wouldn't think they'd have a problem, but who really knows? At the very least, it's kind of painful to set up.
All of this said, they aren't really doing anything that's terribly mysterious. I haven't looked over all the Win32 API calls you're making, but chances are you could probably do something like unicows.dll yourself by reproducing the wide versions of the function calls yourself, having them call the ANSI versions on 9x systems (doing the Unicode-to-ANSI conversions between the wide/narrow API calls), or pass-through to the Unicode versions on other systems.
Depending on the volume of code involved, this might not be so bad. Or, it could be a nightmare. I don't know... I haven't looked too deeply into Ultimate++'s inner-workings to see how much of the Win32 API you're using.
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11577 is a reply to message #11360] |
Tue, 18 September 2007 03:04   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
Update: I've found a flaw in my SCU method. Since main.cpp will contain all U++ code due to the includes, any change in main.cpp will require full rebuild - not nice. I've reworked the structure - now there are 2 additional files - UppBase.h and UppBase.cpp. To use U++, user should copy & add both to his project. UppBase.h may be included in all source files that use U++. Also, UppBase.h contains #includes of packages that should be used. UppBase.cpp is a helper source, that will be the SCU. Unless it's changed (and it will change only if UppBase.h changes - which probably happens only when packages set changes), full rebuild of U++ won't be necessary.
Surprisingly, this method allowed me to detect more bugs. That's because all U++ headers are included before the first U++ cpp. Example: IsClipboardFormatAvailable is used in Draw/MetaFile, and it is defined in CtrlCore/CtrlCore.h. Seems fine, usually is fine, but actually incorrect. Draw package doesn't declare in uses (Draw.upp) that it uses CtrlCore, yet it uses its function. Not sure what's the best solution (easiest is to add CtrlCore to Draw's uses).
P.S. .upp inconsistencies that still haven't been corrected (as of 709dev1):
ide/VectorDes uses VectorDes (itself?, and wrong folder)
Ole/Ctrl/Calc and some Geom packages use T??? packages (such don't exist)
coff/uar/uld/uar.upp - probably should delete whole folder...
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11590 is a reply to message #11360] |
Tue, 18 September 2007 18:05   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
OK, the issue with clipboard had nothing to do with dependencies, it was an API call mistaken for U++ function by the compiler. :: solved it.
I've updated to SP1, yet the error didn't go away. At least now it's preceded with C1001 - internal error of compiler. Again, release-only problem, debug works. Compiler's error is on line 54 of RichText/txtop.cpp. Which is: if(update). Yeah, that's one complicated line.
Any help would be welcome.
That's what I got:
d:\programming\upp\richtext\txtop.cpp(54) : fatal error C1001: An internal error has occurred in the compiler.
(compiler file 'F:\SP\vctools\compiler\utc\src\P2\main.c[0x10BF5F00:0x0000002C]', line 182)
To work around this problem, try simplifying or changing the program near the locations listed above.
For the record, F: is a CD drive
Currently I'm rather pleased with the results (MS bug isn't my fault...). I'll test some more example projects (see how .iml and .lay work, test zlib), and retry unicode filenames. Then I'll upload everything.
For now, I'm attaching Animated Hello project. It won't compile without packages headers I've generated, but I'd like to know what you think about the way a typical U++ project on CodeBlocks/MSVC would look (I mean the sources).
P.S. I decided to try out ld and ar replacements.
Animated Hello full rebuild (debug + release):
MinGW 3.4.5 (original) : 4:27 / 562 warnings / 11.9MB + 3.3MB
MinGW 3.4.5 (U++ ld and ar) : 4:14 / 562 warnings / 11.1MB + 3.2MB
-> 5% reduction in build time, 7% reduction in debug EXE size, 1% reduction in release EXE size. Are these the common results? And is it safe to use these programs (I'm somewhat uncomfortable with patching the compiler)?
Attachment: UppTest.zip (4.69KB)
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11591 is a reply to message #11590] |
Tue, 18 September 2007 19:04   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Tue, 18 September 2007 12:05 | OK, the issue with clipboard had nothing to do with dependencies, it was an API call mistaken for U++ function by the compiler. :: solved it.
I've updated to SP1, yet the error didn't go away. At least now it's preceded by C1001, an internal compiler error. Again, it's a release-only problem; debug works. The compiler's error is on line 54 of RichText/txtop.cpp, which is: if(update). Yeah, that's one complicated line.
Any help would be welcome.
That's what I got:
d:\programming\upp\richtext\txtop.cpp(54) : fatal error C1001: An internal error has occurred in the compiler.
(compiler file 'F:\SP\vctools\compiler\utc\src\P2\main.c[0x10BF5F00:0x0000002C]', line 182)
To work around this problem, try simplifying or changing the program near the locations listed above.
For the record, F: is a CD drive
|
Hehe, that one is well known here:) The same result with MSC7.1 and MSC8.
For regular work with TheIDE it is a non-issue, since BLITZ is not recommended for release mode anyway: it produces larger .exes.
| Quote: |
P.S. I decided to try out ld and ar replacements.
Animated Hello full rebuild (debug + release):
MinGW 3.4.5 (original) : 4:27 / 562 warnings / 11.9MB + 3.3MB
MinGW 3.4.5 (U++ ld and ar) : 4:14 / 562 warnings / 11.1MB + 3.2MB
-> 5% reduction in build time, 7% reduction in debug EXE size, 1% reduction in release EXE size. Are these the common results?
|
Well, perhaps for rebuilding everything, the reduction in build time is not that significant. But if you are rebuilding just a single file while developing, which usually takes about 2-3s, that 13s difference is quite welcome.
| Quote: |
And is it safe to use these programs (I'm somewhat uncomfortable with patching the compiler).
|
These really are not patches. These replacements are rewrites from scratch, using U++ Core (NTL). That is where the speed comes from.
As for bugs, who knows... But originals are not completely bug-free either.
Mirek
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11604 is a reply to message #11360] |
Wed, 19 September 2007 00:43   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
Update:
I'm done implementing unicode. I don't exactly like the way I did it, but it works. I prefer some global strings manager like the TSTR I suggested, but since Mirek said he doesn't want TCHARs I've updated all API calls manually. I added PLATFORM_UNICODE define, and replaced many #ifdef PLATFORM_WINCE with it. Conversion mostly implemented through ToSystemCharset and FromSystemCharset.
In the process I've also found and eliminated a "security vulnerability" (that's what memory bugs are called nowadays ) in Log.h:
    sprintf(h, "* %s %02d.%02d.%04d %02d:%02d:%02d, user: %s\n",
            (const char*)FromSystemCharset(exe),
            t.day, t.month, t.year, t.hour, t.minute, t.second,
            (const char*)FromSystemCharset(user));
I added the (const char*) casts; without them I got a segmentation fault in debug. Without the explicit cast, the String object itself is passed through sprintf's variable argument list, and %s then reads it as if it were a char pointer.
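The crash fits how C-style varargs behave: C++ does not guarantee that an object of a non-trivial class can be passed through "...", so the conversion to a char pointer has to happen before the call. A minimal sketch follows; Text is a made-up stand-in for a string class with an implicit const char* conversion, not the real U++ String, and FormatHeader is an illustrative name:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical string class with an implicit conversion to const char*.
// This is NOT U++ String; it only mimics the conversion operator.
class Text {
    std::string data_;
public:
    explicit Text(const char *s) : data_(s) {}
    operator const char *() const { return data_.c_str(); }
};

// Formats a short log header into a caller-supplied buffer.
// The casts run the conversion operator before the values enter the
// variadic argument list, so "%s" receives genuine char pointers.
int FormatHeader(char *out, std::size_t size, const Text& exe, const Text& user)
{
    return std::snprintf(out, size, "* %s, user: %s\n",
                         static_cast<const char *>(exe),
                         static_cast<const char *>(user));
}
```

With a fixed-type parameter the conversion would happen implicitly; it is only the "..." part of the signature that gives the compiler no target type to convert to.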
I received another segmentation fault earlier (but maybe it was this one...), and that one was solved by replacing all To/FromSysChrSet with To/FromSystemCharset. Not a big deal IMHO, they were all used in Win32-specific code anyway...
I've tested unicode filenames - it did open a multilingually-named file and read its content successfully. I've also tested registry - successfully wrote that filename to a REG_SZ key, it's fine. Didn't try to create unicode-named keys (don't wanna kill my windows).
Next (final) step - zlib/lay/iml/images testing. If that goes well, I'll upload everything.
P.S. I'm working from Code::Blocks when editing U++ source; a rebuild for a console project takes half a minute, so I don't want a static lib for debug. GDB and code completion don't work that well, though: GDB reports most stuff in U++ source as incomplete type (so I have to Cout whatever I want to see), and code completion usually can't find definitions of things (but it does know function prototypes).
P.S.2 Since you know about the MSVC bug, did anyone report it to MS? There might be a chance that they fix it...
Edit: I think UTFBOM class I posted above, or something else implementing that functionality, should be added to some place in U++. That way unicode support will be complete - unicode filenames + unicode text.
[Updated on: Wed, 19 September 2007 01:07]
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11605 is a reply to message #11604] |
Wed, 19 September 2007 09:34   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Tue, 18 September 2007 18:43 | Update:
I'm done implementing unicode. I don't exactly like the way I did it, but it works. I prefer some global strings manager like the TSTR I suggested, but since Mirek said he doesn't want TCHARs I've updated all API calls manually. I added PLATFORM_UNICODE define, and replaced many #ifdef PLATFORM_WINCE with it. Conversion mostly implemented through ToSystemCharset and FromSystemCharset.
In the process I've also found and eliminated a "security vulnerability" (that's what memory bugs are called nowadays ) in Log.h:
sprintf(h, "* %s %02d.%02d.%04d %02d:%02d:%02d, user: %s\n",
(const char*)FromSystemCharset(exe),
t.day, t.month, t.year, t.hour, t.minute, t.second, (const char*)FromSystemCharset(user));
I added the (const char*), otherwise I got segmentation fault in debug. Probably without the explicit cast sprintf thought that String is a char array.
I received another segmentation fault earlier (but maybe it was this one...), and that one was solved by replacing all To/FromSysChrSet with To/FromSystemCharset. Not a big deal IMHO, they were all used in Win32-specific code anyway...
|
Uh oh, that is no good. There definitely should be FromSysCharset, which returns a pointer to a static char array.
The reason is that Log is supposed to work independently of the rest of U++, so that it can be used in situations where the rest has failed or is not available. By using vanilla FromSystemCharset, you make it dependent on String and in turn on the memory allocator. Therefore it will not work when the heap crashes, and it cannot be used to debug the memory allocator...
| Quote: |
I've tested unicode filenames - it did open a multilingually-named file and read its content successfully. I've also tested registry - successfully wrote that filename to a REG_SZ key, it's fine. Didn't try to create unicode-named keys (don't wanna kill my windows).
|
Sort of redundant work. The final solution will use dynamic .dll loading to choose between W and A variants.
| Quote: |
P.S.2 since you know about the MSVC bug, did anyone report to MS? There might be a chance that they fix it...
|
No.
| Quote: |
Edit: I think UTFBOM class I posted above, or something else implementing that functionality, should be added to some place in U++. That way unicode support will be complete - unicode filenames + unicode text.
|
Well, the question is the definition. Thinking about it, I just fail to see what the exact specification is. If detection of UTF-8 files is the purpose, that can easily be done without specific code. Also, what is going to happen if there is no UTFBOM?
Should it be available as
bool SkipUTFBOM(Stream& in);
const char *SkipUTFBOM(const char *s);
?
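A possible shape for the pointer variant, as a sketch only: it handles just the UTF-8 BOM (bytes EF BB BF), and it answers the "no BOM" question by returning the input unchanged, which is one assumption among several reasonable ones.

```cpp
#include <cstring>

// Returns the position just past a UTF-8 BOM if the text starts with one,
// otherwise returns the input pointer unchanged.
// strncmp stops at the terminating NUL, so short strings are safe.
const char *SkipUTFBOM(const char *s)
{
    static const char bom[] = "\xEF\xBB\xBF";
    return std::strncmp(s, bom, 3) == 0 ? s + 3 : s;
}
```

A Stream variant would need to peek at the first bytes and put them back when they are not a BOM, which depends on the stream API and is not shown here.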
Mirek
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11607 is a reply to message #11605] |
Wed, 19 September 2007 10:12   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
I see now... I knew there should've been a good reason for a second set of charset functions. OK, I'll reverse the changes and see if that works.
About registry:
bool SetWinRegString(const String& string, const char *value, const char *path, HKEY base_key) {
    HKEY key = 0;
    if(RegCreateKeyEx(base_key, ToSystemCharset(path), 0, NULL, REG_OPTION_NON_VOLATILE,
                      KEY_ALL_ACCESS, NULL, &key, NULL) != ERROR_SUCCESS)
        return false;
#ifdef PLATFORM_UNICODE
    WString wstring = string.ToWString(); wstring.Cat(0, 1);
    bool ok = (RegSetValueEx(key, ToSystemCharset(value), 0, REG_SZ, (const byte*)(const wchar*)wstring, (wstring.GetLength() + 1)*2) == ERROR_SUCCESS);
#else
    bool ok = (RegSetValueEx(key, value, 0, REG_SZ, (const byte*)(const char*)string, string.GetLength() + 1) == ERROR_SUCCESS);
#endif
    RegCloseKey(key);
    return ok;
}
The #else part is what the code was previously (I added the casts, though). Linking that to the W version wouldn't be possible: defining UNICODE without the #ifdef would cause a cast error from char* to WCHAR*.
I found that I didn't modify everything, since console apps don't include much. Then I found that a macro L_() already exists. Thus using TCHAR instead of char/wchar, plus L_() and To/FromSystemCharset, might indeed remove these #ifdefs. But that would come later; first I want to ensure everything works.
I'm not sure how you want to use dynamic dll loading. Change all #ifdefs into if/elses, and explicitly call W and A versions, to enable runtime switching between ANSI/Unicode?
UTFBOM: Skip BOM of UTF-8 / UTF-16 LE / UTF-16 BE files (not only UTF-8), and read ASCII/UTF-8 (if there's no BOM, it's considered ASCII) into String, UTF-16 LE/BE into WString. Convert UTF-8 String into UTF-8 / UTF-16 LE/BE with/without BOM. I guess it should be:
int FromFileCharset(const String& s, String* os, WString* ows);
String ToFileCharset(const String& s, int bytes, bool BOM = true, bool LE = true);
String ToFileCharset(const WString& s, int bytes, bool BOM = true, bool LE = true);
(maybe should add ASCII -> UTF-8 conversion if there's no BOM, since chars > 127 could cause invalid UTF-8, being just system-charset chars)
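To make the BOM and LE flags concrete, here is a rough sketch of the UTF-16 output side. It takes std::u16string rather than WString and is called Utf16ToBytes rather than ToFileCharset; both are assumptions made so the example stands alone without U++.

```cpp
#include <string>

// Hypothetical helper: serialises UTF-16 code units into raw file bytes,
// optionally prefixed with a BOM, in little- or big-endian byte order.
// Surrogate pairs need no special handling, since each 16-bit unit is
// written out independently.
std::string Utf16ToBytes(const std::u16string& text,
                         bool bom = true, bool little_endian = true)
{
    std::string out;
    out.reserve((text.size() + 1) * 2);
    auto put = [&](char16_t unit) {
        char lo = static_cast<char>(unit & 0xFF);
        char hi = static_cast<char>(unit >> 8);
        if(little_endian) { out += lo; out += hi; }
        else              { out += hi; out += lo; }
    };
    if(bom)
        put(0xFEFF); // becomes FF FE in little-endian, FE FF in big-endian
    for(char16_t unit : text)
        put(unit);
    return out;
}
```

Writing the BOM through the same byte-order path as the text is what makes a reader able to tell LE from BE.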
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11609 is a reply to message #11607] |
Wed, 19 September 2007 10:45   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Wed, 19 September 2007 04:12 |
I'm not sure how you want to use dynamic dll loading. Change all #ifdefs into if/elses, and explicitly call W and A versions, to enable runtime switching between ANSI/Unicode?
|
Yes. With .dli, it is not as much trouble as it seems. In fact, you forced me to work on it right now 
| Quote: |
UTFBOM: Skip BOM of UTF-8 / UTF-16 LE / UTF-16 BE files (not only UTF-8), and read ASCII/UTF-8 (if there's no BOM, it's considered ASCII) into String, UTF-16 LE/BE into WString. Convert UTF-8 String into UTF-8 / UTF-16 LE/BE with/without BOM. I guess it should be:
int FromFileCharset(const String& s, String* os, WString* ows);
String ToFileCharset(const String& s, int bytes, bool BOM = true, bool LE = true);
String ToFileCharset(const WString& s, int bytes, bool BOM = true, bool LE = true);
(maybe should add ASCII -> UTF-8 conversion if there's no BOM, since chars > 127 could cause invalid UTF-8, being just system-charset chars)
|
I see. In fact you suggest something like LoadUnicodeAny, which detects the kind of file and loads UTF-8 or UTF-16LE or UTF-16BE.
Returning WString (it is easy to convert it to String).
Hm, perhaps there should be two variants after all to avoid an unnecessary UTF-8 -> UTF-16 -> UTF-8 conversion...
Mirek
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11611 is a reply to message #11609] |
Wed, 19 September 2007 11:18   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
Why is dli needed? You already have all functions from #include <windows.h>. The only trouble would be explicitly calling the A and W versions. Thinking of it, it sounds nice: make PLATFORM_UNICODE a global boolean, initialized to true unless the OS is Win9x. But I'd prefer to finish the way I started, to see everything work.
UTF-8 -> UTF-16 -> UTF-8 won't happen. FromFileCharset returns String if it's ASCII/UTF-8 and WString if it's UTF-16. Its return value is a byte-count code: 0 -> ASCII / String, 1 -> UTF-8 / String, 2 -> UTF-16 / WString (4 -> UTF-32 / WString, but not implemented). What could happen is UTF-16 -> WString -> String, but UTF-16 -> WString isn't expensive.
I wanted to compile UWord (now in ANSI, GUI Unicode isn't complete yet) to see if zlib works (UWord.iml), and found an interesting problem in PdfDraw:
ScreenDraw sd;
That causes a warning ("statement is a reference, not a function call") plus an error about the definition of sd. In Draw/DrawWin32, ScreenDraw is a class, but there is also:
ScreenDraw& ScreenDraw()
{
    return Single<ScreenInfoClass>();
}
That's a singleton? Whatever it is, it doesn't work: in ScreenDraw sd; the name ScreenDraw is resolved as the function, not the class type. Any suggestions how to fix?
P.S. Why does U++ use so many global functions? I prefer .Net-style, tree-like organization using namespaces/classes. After all, gathering functions into static classes should be relatively easy, and at the cost of some extra typing you (potentially) resolve naming conflicts, and make stuff easier to find. E.g. I may not know that there's a function named GetWinRegString hidden somewhere in Core/Win32Com. But if there was a class Registry, it would be more likely that I'd find it by typing Registry::. Plus that would be an OOP approach.
|
|
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11613 is a reply to message #11611] |
Wed, 19 September 2007 11:39   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Wed, 19 September 2007 05:18 | Why need dli? You already have all functions from #include <windows.h>.
|
Yes, but it would not start on Win98 ("missing dll call").
| Quote: |
UTF-8 -> UTF-16 -> UTF-8 won't happen. FromFileCharset returns String if it's ASCII/UTF-8 and WString if it's UTF-16.
|
How? Sure, you can make it return something more complex, but I would prefer two functions: one returning String and other WString.
| Quote: |
I wanted to compile UWord (now in ANSI, GUI Unicode isn't complete yet) to see if zlib work (UWord.iml), and found an interesting problem in PdfDraw:
ScreenDraw sd;
That causes a warning of statement is a reference not a function call. + error about sd definition. In Draw/DrawWin32, ScreenDraw is a class, but also:
ScreenDraw& ScreenDraw()
{
return Single<ScreenInfoClass>();
}
That's a singleton? Whatever it is, it doesn't work - ScreenDraw sd; is recognized as a function name, not class type. Any suggestions how to fix?
|
Now this is really interesting. It is an obvious bug (minor, just something forgotten), but tell me why it compiles without an error in TheIDE, with both MSC and all versions of MinGW?
Maybe a different way of doing SCU? (Your SCU is bigger than mine?)
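On the language side, the behaviour is standard C++: when a class and a function with the same name are declared in the same scope, the function hides the class name wherever both are visible, and the class stays reachable through an elaborated type specifier (class X / struct X). Whether both declarations are visible at a given line depends on include order, which may explain why one build setup trips over it and another doesn't; that part is a guess, not something confirmed in this thread. A small sketch with made-up names (Screen and DefaultDpi are not U++ code):

```cpp
// Hypothetical class mirroring the ScreenDraw situation.
struct Screen {
    int dpi = 96;
};

// Accessor with the same name as the class. In the return type, Screen
// still names the class, because the function is not declared yet.
Screen& Screen()
{
    static struct Screen instance; // "struct" is required from here on
    return instance;
}

// Screen s;         // error: Screen now names the function
// struct Screen s;  // OK: elaborated type specifier finds the class
int DefaultDpi()
{
    struct Screen local;
    return local.dpi;
}
```

So one workaround is writing class ScreenDraw sd; at the point of use; renaming either the function or the class is the cleaner fix.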
| Quote: |
P.S. Why does U++ use so many global functions? I prefer .Net-style, tree-like organization using namespaces/classes. After all, gathering functions into static classes should be relatively easy, and at the cost of some extra typing you (potentially) resolve naming conflicts, and make stuff easier to find.
|
Actually, if you look a bit more carefully, then either these are really very globally used names (e.g. AsString) - there is a limited number of such cases, or they are carefully bound to parameters so that C++ overloading resolves the conflicts.
Believe it or not, function name conflicts are not an issue. (Moreover, everything is still in the U++ namespace.)
| Quote: |
Plus that would be an OOP approach
|
I like multi-paradigm approaches more 
Mirek
|
|
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11617 is a reply to message #11613] |
Wed, 19 September 2007 12:22   |
sergei
Messages: 94 Registered: September 2007
|
Member |
|
|
Regarding UTF - I understand that. I made it take 2 pointers - to String and WString. That was for optimal conversion. I'll split that, and add a function that returns encoding bytes according to BOM - so if user wants to go for the faster conversion, he would be able to choose between String and WString.
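A sketch of what such a BOM-inspecting function could look like, using the codes mentioned earlier in the thread (0 for no BOM / ASCII, 1 for UTF-8, 2 for UTF-16). BomInfo and DetectBom are hypothetical names, and UTF-32 is deliberately left out, so a UTF-32 LE file would be misreported as UTF-16 LE here.

```cpp
#include <cstddef>

// Result of inspecting the first bytes of a file (hypothetical type).
struct BomInfo {
    int         bytes;  // 0 = no BOM (treated as ASCII), 1 = UTF-8, 2 = UTF-16
    bool        little; // byte order; only meaningful when bytes == 2
    std::size_t skip;   // number of BOM bytes to skip before the text
};

// Checks for the UTF-8 and UTF-16 LE/BE byte order marks.
BomInfo DetectBom(const unsigned char *p, std::size_t len)
{
    if(len >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return BomInfo{1, false, 3};
    if(len >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return BomInfo{2, true, 2};
    if(len >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return BomInfo{2, false, 2};
    return BomInfo{0, false, 0};
}
```

The caller can then pick the String or WString loading path from bytes and start reading at offset skip.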
Regarding SCU - no idea... My SCU first includes all headers, then all sources (of packages used).
Tested PDF export, now works. Should complete unicode, and test layouts and more images.
P.S. I tried Hebrew support in UWord. Pretty bad. While it does display what I type correctly, everything else doesn't work. E.g. there is no switch-to-RTL button, home/end keys work as in LTR, left/right work for some reason... Worse yet, it doesn't position the cursor where I see it:
(TEX|T)
If I type a letter I get:
(TELX|T)
Same goes for copy/paste of selected text, it doesn't select the correct text (selects same amount from other side).
Even worse in textboxes (like in save file). Not only inserting is wrong, but there's also a sliding problem - when you select text, it changes while you're selecting it (slides into selection from other side). I've seen that in a few apps before, makes editing text impossible.
Could at least partial support be improved? I mean, ignore keys and RTL layout; insert/delete/select/copy should at least be WYSIWYG. BTW, export to PDF reversed the Hebrew text. Expected, but not nice.
|
|
|
|
| Re: Building & using U++ without TheIDE [message #11618 is a reply to message #11617] |
Wed, 19 September 2007 12:39   |
 |
mirek
Messages: 14291 Registered: November 2005
|
Ultimate Member |
|
|
| sergei wrote on Wed, 19 September 2007 06:22 |
E.g. no switch to RTL button, home/end keys work as in LTR, left/right work for some reason... Worse yet - it doesn't position the cursor where I see it:
(TEX|T)
If I type a letter I get:
(TELX|T)
Same goes for copy/paste of selected text, it doesn't select the correct text (selects same amount from other side).
Even worse in textboxes (like in save file). Not only inserting is wrong, but there's also a sliding problem - when you select text, it changes while you're selecting it (slides into selection from other side). I've seen that in a few apps before, makes editing text impossible.
Could at least partial support be improved? I mean, ignore keys and RTL layout. At least should be WYSIWYG in insert/delete/select/copy. BTW, export to PDF reversed Hebrew text. Expectable, but not nice.
|
Hey, Sergei, I have told you that RTL is not implemented.
In fact, so far there has been nobody capable of even commenting on this issue. We could base it on some generic info available, but I guess this would not really help... We need a real person who knows how this is supposed to work;)
Mirek
|
|
|
|
|
|