-
Why is there so much talk about “bad code” and “bad practices”? Because they are important!
Lately I had an unpleasant experience with uncommented code, bad design, badly implemented OOP, and unoptimized, badly designed databases.
Comments
It is a great mystery to me how every book and tutorial (not just for PHP) can say that comments are not optional but MANDATORY, and yet most often they are entirely missing. Zend Studio has a very simple and efficient auto-complete system: you just type “/**”, press Enter, and then fill in the text. NetBeans has a similar system, just as easy.
And still, I have come across thousands of lines of code with almost no comments at all. The outcome? Hours and hours wasted trying to follow the logic!
Why is this happening? First reason: it is boring. A developer wants to write code, not stories, and commenting usually seems like wasted time. Second reason: everything seems very logical when it is written, and if it is so logical and fluent, why waste time with stories? Because time passes, projects change, and it is inevitable that the logic will be forgotten. Another reason: developers come and go in companies, and the new guy can't follow the logic with the same ease; in fact it is almost impossible to follow. Even the author of the code can't follow the steps after a long period of time, and sometimes that author was me.
In my opinion this should be a rule of thumb for every company: no class, method or property should be left uncommented. Time spent on comments now is time gained later, when debugging, optimizing and so on.
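As a minimal sketch of what such a comment can look like, here is a docblock of the kind the “/**” auto-complete generates. The function grossPrice() and its parameters are hypothetical, invented only for this illustration.

```php
<?php
/**
 * Calculates the gross price of a product.
 *
 * @param float $netPrice Net price, without tax.
 * @param float $taxRate  Tax rate as a fraction, e.g. 0.19 for 19%.
 *
 * @return float Gross price rounded to two decimals.
 */
function grossPrice(float $netPrice, float $taxRate): float
{
    // Round here so callers never have to deal with fractions of a cent.
    return round($netPrice * (1 + $taxRate), 2);
}

echo grossPrice(100.0, 0.19), PHP_EOL;
```

The docblock says what the function expects and returns, and the inline comment says why the rounding is done there, which is the part that tends to be forgotten first.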
Bad design
I encountered a question in an online “mini interview”: “Do you see the importance of architectural analysis before writing code?” (I'm sorry if I didn't get the exact wording.) The first time I saw that question I had a “déjà vu” moment: many times I have started writing code only to realize that it was the wrong approach.
Most of the time the issue is (apparently) solved with time and experience. Basically, if you get a beginner to write code, he will most likely take some bad approaches before finding a good one. This is not abnormal, and that's why I think a beginner should be guided before he begins to write code, with a “mentor” suggesting the logic of the resulting code.
At the other extreme there are “software architects”, who use UML diagrams to describe the logic and the structures. When diagrams exist, it is much easier to follow the entire process and structure of the app. An experienced architect will be able to see possible issues before any code is written, and when coding starts everyone knows exactly what they have to do.
OOP is probably the most affected by poor design. Lately I've seen a lot of classes with no internal structure; they were just simple wrappers for SQL queries. That's not OOP!
OOP is about abstracting elements into classes and objects. For instance, the keyboard is a class which has keys (objects of another class) with various properties (letter, key code, position), some LEDs (yet another class), etc. The way they are organized in the database is not necessarily tightly related to the resulting objects, as it may seem.
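The keyboard example could be sketched roughly like this. All class, property and method names are hypothetical, and the sketch assumes the keyboard simply holds its keys and LEDs as objects; it is one possible structure, not the only correct one.

```php
<?php
// Hypothetical classes for the keyboard example; not taken from any library.

class Key
{
    public $letter;
    public $keyCode;
    public $position;

    public function __construct(string $letter, int $keyCode, int $position)
    {
        $this->letter = $letter;
        $this->keyCode = $keyCode;
        $this->position = $position;
    }
}

class Led
{
    public $name;
    public $isOn = false;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

class Keyboard
{
    /** @var Key[] keys indexed by key code */
    private $keys = [];
    /** @var Led[] LEDs indexed by name */
    private $leds = [];

    public function addKey(Key $key): void
    {
        $this->keys[$key->keyCode] = $key;
    }

    public function addLed(Led $led): void
    {
        $this->leds[$led->name] = $led;
    }

    /** Returns the letter for a key code, or null when no such key exists. */
    public function letterFor(int $keyCode): ?string
    {
        return isset($this->keys[$keyCode]) ? $this->keys[$keyCode]->letter : null;
    }
}

$keyboard = new Keyboard();
$keyboard->addKey(new Key('A', 65, 1));
$keyboard->addLed(new Led('caps_lock'));

echo $keyboard->letterFor(65), PHP_EOL;
```

Note that nothing here mentions tables or queries: how keys and LEDs are stored is a separate decision from how the objects relate to each other.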
If you're using OOP and what you are reading now sounds weird, try drawing on a piece of paper a diagram of the objects in your app and the references between them. If you can't, it means that your approach to OOP is wrong (or you just don't know how to draw a diagram 🙂 )!
We all make mistakes when it comes to OOP, but that's no excuse not to correct them, and to try to do the architecture before the code.
A bad app design may have very important financial implications. Time is money: if an app is poorly designed and not correctly structured, debugging takes a long time, changes and enhancements require a lot of time, there is a lot of redundant code, and so on. Then you can be sure you're losing money.
A tool that I sometimes use is Violet UML Editor. It is not a true UML editor like Rational Rose, for instance, but rather an open-source toy. With Violet you can only build visual diagrams, but they can be useful for visually structuring an app.
Databases
Why do PHP developers avoid truly learning MySQL? Sounds strange? It is very true though. Modifying PHP code is usually not very difficult (I mean the practical rewriting of the code), but a bad database design is most of the time more difficult to change because of the risk of losing data.
A few weeks ago I made a diagram of a database using MySQL Dump and MySQL Workbench. I was quite surprised to see tables which didn't have relation keys to the tables the information came from (I don't mean settings tables or other tables which logically have no relation with the other tables), so the data source was completely lost.
Another classic problem with beginners is that when they have a relation table between two other tables, like categories and products for instance, the primary key is on a field like “id” which has no relevance. A primary key can be set on multiple fields, for the previous example “id_category, id_product” instead of “id”; this way the primary key constraint ensures that a product appears only once in a category.
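A rough sketch of the composite primary key in action. To keep it runnable without a server it uses an in-memory SQLite database through PDO, which is an assumption of this example (it needs the pdo_sqlite extension); the same PRIMARY KEY clause works in MySQL. The table name category_product is hypothetical.

```php
<?php
// Assumption: pdo_sqlite is available. SQLite is used only so the sketch
// runs without a MySQL server; the table name is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The primary key spans both columns, so each (category, product) pair is unique.
$pdo->exec(
    'CREATE TABLE category_product (
        id_category INTEGER NOT NULL,
        id_product  INTEGER NOT NULL,
        PRIMARY KEY (id_category, id_product)
    )'
);

$insert = $pdo->prepare(
    'INSERT INTO category_product (id_category, id_product) VALUES (?, ?)'
);
$insert->execute([1, 10]);
$insert->execute([1, 11]); // same category, different product: accepted

$duplicateRejected = false;
try {
    $insert->execute([1, 10]); // same pair again: violates the primary key
} catch (PDOException $e) {
    $duplicateRejected = true;
}

$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM category_product')->fetchColumn();

echo $duplicateRejected ? 'duplicate rejected' : 'duplicate accepted', PHP_EOL;
echo $rowCount, ' rows stored', PHP_EOL;
```

With a meaningless “id” as the only key, the third insert would have succeeded and the duplicate would have to be caught in application code instead.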
Another thing that is usually avoided is indexes. In a previous blog post I explained them briefly, insufficiently given how important they are. An index can significantly reduce the search time in a table, from tenths of a second to thousandths of a second. An app that is badly optimized from this point of view can have a significantly longer response time than normal.
Frameworks
To quote a classic phrase in the PHP community:
and Laura Thomson had some strong reasons to back this up.
Somebody was saying last week that the reason for bad code is actually PHP and its loose typing. Let's be honest: if we take into consideration a language like C++, there are a lot more issues that can arise. I remember how bad my C++ code was at university, and the problem wasn't the language but rather my skills at that time. PHP allows approaches from OOP to spaghetti code (OOP, procedural, closures, labels). The fact that many developers choose bad approaches is not a language problem; the same issue exists with a language like C++, or in fact with any programming language out there.
Why are there fewer design problems in Ruby on Rails, for instance? Because it is a framework! I've never heard of anybody doing web development using just Ruby (there are such developers out there, especially for desktop apps, but that's another story), so of course there are fewer issues when using a framework. In the same way, PHP issues can be reduced by using a popular framework.
There are tens or even hundreds of open-source PHP frameworks. Of these, a few are really popular, like Zend Framework, CakePHP, Symfony, Solar, CodeIgniter etc. A great advantage of using a popular framework is that it is easy to find professionals. Another big advantage is that you have a well tested and documented code base, something that is very hard to achieve in a small company.
And even if you're using an in-house framework, I think it is a good idea to adopt the structure of a popular framework, to reduce the learning curve for new developers.
Using a popular open-source framework usually reduces the working time and the time to develop new features, because they are often already included in the framework. So economic advantages may arise (money), along with a better structure and, last but not least, happier developers (which I'm not at this time).
Concluding:
- set some rules for the code standards, don’t forget to add the comments to the list,
- make sure the app design is according to a plan that allows for scalability and minimal code redundancy,
- make sure the database is well structured and optimized,
- consider an open-source popular framework over building an internal one.
Following these simple rules will save resources and time, and developers will probably be happier with their results.
-
Today I've updated the Romanian stemmer class to version 0.6.
It used to display notices, but now they are fixed.
Enjoy!
-
One of the biggest issues with the web is encoding.
In the old days the base standard was ISO 8859-1, which defines 191 Latin characters, with 1 char = 1 byte. Different languages used different encodings, which led to many problems: portability, the inability to cover a larger number of languages, and so on.
The problem occurs when a project should be available in several languages, and the number of languages is not known in advance. A big project like WordPress, for example, should be available in any language.
Unicode is a better alternative to ISO 8859-1, with more than 100,000 characters defined. In other words, it has just about every character of just about any existing language.
As I was saying for MySQL, UTF-8 characters have a variable length, between 1 and 4 bytes.
Displaying the UTF-8 content in PHP pages
For the browser to interpret the page content as UTF-8, it should receive the right header:
<?php header("Content-type: text/html; charset=utf-8"); ?>
Attention! Headers must be sent before any output from the script. In other words, the header() call must come before anything is displayed on the page.
The type of the document can also be specified with the “Content-Type” meta tag. If there is a similar meta tag on the page, it should be removed and replaced with:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
The .htaccess file and string processing
Add the following lines to the .htaccess file (for Apache servers):
# default charset used by PHP
php_value default_charset utf-8
# encoding for mbstring
php_value mbstring.internal_encoding utf-8
php_value mbstring.func_overload 7
The first directive sets the default charset for PHP; this setting can also be made directly in php.ini.
The second and third directives configure the mbstring (multibyte string) functions.
With UTF-8, as I was saying earlier, 1 char != 1 byte, so errors may appear:
$var = 'aşadar';

// byte-oriented functions
echo strlen($var).PHP_EOL; // 7
echo strtoupper($var).PHP_EOL; // AşADAR

// using mbstring functions
echo mb_strlen($var).PHP_EOL; // 6
echo mb_strtoupper($var).PHP_EOL; // AŞADAR
This is why we set the mbstring function overloading mode in the .htaccess file. Content entered through forms should be processed using mbstring functions, to avoid problems like the ones in the example above.
The available functions are in the manual.
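As a small sketch, the mbstring functions can also be called with the encoding passed explicitly, so the result does not depend on the server configuration. This assumes the mbstring extension is loaded and that the file itself is saved as UTF-8.

```php
<?php
// Assumptions: mbstring is loaded and this file is saved as UTF-8.
$var = 'aşadar';

$bytes = strlen($var);                   // counts bytes
$chars = mb_strlen($var, 'UTF-8');       // counts characters
$upper = mb_strtoupper($var, 'UTF-8');   // upper-cases ş as well
$start = mb_substr($var, 0, 2, 'UTF-8'); // first two characters, not bytes

echo $bytes, PHP_EOL; // 7
echo $chars, PHP_EOL; // 6
echo $upper, PHP_EOL; // AŞADAR
echo $start, PHP_EOL; // aş
```

Passing 'UTF-8' on every call is more verbose, but the code then behaves the same on a server where the .htaccess settings are missing.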
Converting old content
There are many ways to convert ISO 8859-1 content to UTF-8. A couple of ways of doing that with PHP are:
– the iconv() function, which converts from one encoding to another specified encoding:
echo iconv("ISO-8859-1", "UTF-8", "Test");
– the utf8_encode() function, which converts from ISO 8859-1 to UTF-8:
echo utf8_encode("Test");
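With a pure ASCII string like "Test" the conversion changes nothing, so here is a sketch with a character where the two encodings actually differ. The sample string is an assumption of this example; it needs the iconv and mbstring extensions.

```php
<?php
// "café" encoded as ISO 8859-1: é is the single byte 0xE9.
$latin1 = "caf\xE9";

// Convert to UTF-8, where é becomes the two bytes 0xC3 0xA9.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo strlen($latin1), PHP_EOL; // 4
echo strlen($utf8), PHP_EOL;   // 5
echo bin2hex($utf8), PHP_EOL;  // 636166c3a9

// The original bytes are not valid UTF-8, the converted ones are.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

A check like mb_check_encoding() is useful before converting old content, because converting a string that is already UTF-8 a second time corrupts it.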
What does the future bring?
The long-awaited PHP 6 will have native support for Unicode, so all the above tricks will be unnecessary. At the time of writing, PHP 6 is 70.70% done, and with a little luck it will be ready in less than a year.
-
With globalization, the old ASCII code is no longer suitable. Suppose that one day you have to develop a project in German, Russian or even Japanese: you could adapt the charset for each of these languages, or you could simply develop using Unicode.
To use Unicode with MySQL, UTF-8 can be used.
Note that UTF-8 characters are variable in length and ASCII compatible. In ASCII 1 char = 1 byte; in UTF-8 1 char takes between 1 and 4 bytes. Keep in mind that MySQL's utf8 charset stores at most 3 bytes per character; 4-byte characters need the utf8mb4 charset, available since MySQL 5.5.3.
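To make the variable length concrete, here is a small PHP sketch that counts the bytes of four sample characters. The choice of characters is an assumption of this example; the "\u{...}" escapes need PHP 7 or later.

```php
<?php
// "\u{...}" escapes (PHP 7+) produce UTF-8 bytes regardless of the file's encoding.
$samples = [
    'a',          // U+0061, plain ASCII letter
    "\u{015F}",   // ş, Latin Extended-A
    "\u{20AC}",   // €, euro sign
    "\u{1F600}",  // an emoji outside the Basic Multilingual Plane
];

$byteLengths = [];
foreach ($samples as $char) {
    $byteLengths[] = strlen($char); // strlen() counts bytes, not characters
}

echo implode(', ', $byteLengths), PHP_EOL; // 1, 2, 3, 4
```

The ASCII letter keeps its single byte, which is why UTF-8 is ASCII compatible, while the other characters grow up to the 4-byte maximum.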
UTF-8 charset and collation on the server
The character type in MySQL is determined by the charset.
To check if UTF-8 is installed on the server:
SHOW CHARSET LIKE 'utf8';
or with information_schema:
SELECT * FROM information_schema.CHARACTER_SETS WHERE CHARACTER_SET_NAME = 'utf8';
If the charset was found then we can continue.
Another element that comes with the charset is the collation, which is used for comparing strings when ordering.
To see what collations are available on the server:
SHOW COLLATION WHERE CHARSET = 'utf8';
or with information_schema:
SELECT * FROM information_schema.COLLATIONS WHERE CHARACTER_SET_NAME = 'utf8';
Collations are usually defined per language, for example for comparing strings with or without diacritics. A “bin” collation can also be used, which orders strings in binary mode, i.e. by byte value, so “A” sorts before “a”, for example.
If no collation is specified, then the default one will be used.
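The difference between binary and case-insensitive ordering can be sketched in PHP. This is only an analogy, under the assumption that byte-wise string comparison in PHP behaves like a “bin” collation for these ASCII samples; it does not call MySQL.

```php
<?php
$words = ['b', 'A', 'a', 'B'];

// Byte-wise ordering, similar in spirit to a "bin" collation:
// upper-case letters (0x41...) come before lower-case ones (0x61...).
$binary = $words;
sort($binary, SORT_STRING);

echo implode(' ', $binary), PHP_EOL; // A B a b

// Byte-wise, "A" is smaller than "a"...
var_dump(strcmp('A', 'a') < 0);       // bool(true)
// ...while a case-insensitive comparison treats them as equal.
var_dump(strcasecmp('A', 'a') === 0); // bool(true)
```

A language collation goes further than ignoring case: it also decides where characters with diacritics sort, which plain byte comparison cannot do.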
UTF-8 and databases
When creating a database you can specify the default charset, which is used for all new tables that don't specify a charset of their own.
For example:
CREATE DATABASE db_name CHARACTER SET utf8 COLLATE utf8_romanian_ci;
Or, to modify the default charset of a database which already exists:
ALTER DATABASE db_name CHARACTER SET utf8 COLLATE utf8_romanian_ci;
UTF-8, tables and columns
To modify tables that already exist, ALTER TABLE is used.
A table can have a default charset and collation, and each column can have its own charset and collation.
For more information about the table:
SHOW CREATE TABLE tab;
To set the default charset and collation for an existing table (existing columns keep their own charset):
ALTER TABLE tab CHARSET = utf8 COLLATE = utf8_romanian_ci;
To modify the charset of a VARCHAR(200) column:
ALTER TABLE tab MODIFY c1 VARCHAR(200) CHARSET utf8 COLLATE utf8_romanian_ci;
String size
A “problem” that may arise is related to the size of a character: it can take several bytes rather than exactly one. That is why, for measuring a string column (like VARCHAR), you must use CHAR_LENGTH(str) instead of LENGTH().
A short example:
SET @var = 'aşadar';
SELECT CHAR_LENGTH(@var) AS 'Char', LENGTH(@var) AS 'Length';

-- The output is: Char = 6 and Length = 7 because ş is 2 bytes
-
If you're like me, you prefer manuals in CHM format.
Unfortunately the Zend Framework manual is only available as .pdf and, a little less obviously, in HTML format.
Fortunately, generating a manual in CHM format is easy (really, it is).
The steps are:
- Download and install HTML Help Workshop.
- Download the Zend Framework manual in HTML format; the link is at the bottom right, not very obvious I believe.
- Open HTML Help Workshop.
- Go to File->Open and, from the folder where the manual files are, open htmlhelp.hhp.
- Go to File->Compile.
Done!
The compiled CHM manual is just a few steps away!
-