CS 100 (Learn) — CS 100 (Web) — Module 07
Previously we saw a special code known as a character entity reference. A character entity reference starts with an ampersand (&) and ends with a semicolon (;).
Because the ampersand (&) is used for character entities, to display an ampersand you should use the special entity &amp;. Similarly, the angle brackets (<, >) used for tags can be displayed with &lt; and &gt; for less-than and greater-than.
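As a small illustration of these three entities, here is a sketch of how they might appear inside a page. The paragraph text is made up for this example; only the entities themselves come from the discussion above.

```html
<!-- The browser shows: Tom & Jerry -->
<p>Tom &amp; Jerry</p>

<!-- The browser shows: Use the <p> tag to start a paragraph. -->
<p>Use the &lt;p&gt; tag to start a paragraph.</p>
```

Without the entities, the browser would try to read the second line's inner <p> as a real tag instead of displaying it as text.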
Here is another example:
Jamie Sal&eacute; commence &agrave; patiner &agrave; l'&acirc;ge de 5 ans.
Jamie Salé commence à patiner à l'âge de 5 ans.
A collection of entities is available on this Online Chart.
Character entities are convenient for a few characters, but in a previous module we saw how the Unicode standard can be used to represent characters from languages all over the world.
To use a Unicode character in HTML, you add a number sign (#) in front of the number within the entity wrapper.
For example, to display the happy face Unicode character 128522 (a decimal, base-10 number) in HTML, you write &#128522; (😊).
Hello &#128522;
Hello 😊
In practice, Unicode characters are more often known by their hexadecimal (hex) codes. To use a Unicode hex code as an entity, add an extra x after the number sign (#x) to indicate that the code is in hex.
My six-year-old daughter's favourite emoji is &#x1F4A9;.
My six-year-old daughter's favourite emoji is 💩.
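To compare the two notations side by side, the sketch below writes each emoji once in decimal and once in hex. The hex value 1F60A and the decimal value 128169 are conversions worked out for this example, and the paragraph text is made up.

```html
<!-- Both entities on each line should display the same character. -->
<p>Decimal &#128522; and hex &#x1F60A; are the same happy face.</p>
<p>Decimal &#128169; and hex &#x1F4A9; are the same emoji.</p>
```

Either form can be used; the hex form is handy because Unicode charts usually list characters by their hex code.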
(The following is advanced content not required for this course)
If your text editor supports UTF-8, you might be able to just copy and paste a Unicode character directly into your HTML file. Make sure your text editor is saving the file in UTF-8.
If you use Unicode in your HTML this way, then you should add a <meta> tag to your header with a charset attribute indicating the HTML file is in UTF-8.
<head>
<meta charset="UTF-8">
If you are using a lot of Unicode (for example, writing in a non-English language), then this is usually the best approach.
However, this method is slightly more susceptible to display problems across different computers and web browsers. If you are only using a few Unicode characters, it is usually much safer to explicitly add the code with a character entity.
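Putting the two approaches together, a minimal sketch of a complete page might look like the following. The title and paragraph text are placeholders chosen for illustration. The first paragraph relies on the file being saved in UTF-8, while the second uses only character entities and so does not depend on how the file is saved.

```html
<!DOCTYPE html>
<html>
<head>
<!-- Tells the browser that this file is saved in UTF-8. -->
<meta charset="UTF-8">
<title>Unicode example</title>
</head>
<body>
<!-- Character typed directly into the file. -->
<p>Jamie Salé</p>
<!-- Same character written as an entity. -->
<p>Jamie Sal&eacute;</p>
</body>
</html>
```

If the first paragraph displays strange symbols while the second looks correct, the file was probably not saved in UTF-8.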