Dynamic Javascript

Mar 25, 2011 at 6:01 PM

Hello,
How well does the library handle dynamic JavaScript, as in this Google Analytics example:

<script type="text/javascript">
    var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>

<script type="text/javascript">
    try {
        var pageTracker = _gat._getTracker("UA-7276348-1");
        pageTracker._trackPageview();
    } catch (err) { }
</script>
  
Would the library be able to compile the JavaScript in the document.write section?
Regards. 

Coordinator
Mar 25, 2011 at 10:00 PM

Jurassic is not a browser - it doesn't have any built-in concept of a document object (or script tags for that matter). Jurassic is an implementation of ECMA-262 edition 5 <http://en.wikipedia.org/wiki/ECMAScript>. Of course you could define a document object, but if you wanted it to work in all cases you'd have to implement the entire DOM <http://en.wikipedia.org/wiki/Document_Object_Model> and that way lies madness :-)


Mar 25, 2011 at 10:09 PM

Hello Paul,

I've already implemented most of the DOM, so, that madness aside, will it accept statements like document.write with a little tweaking?

Regards

Coordinator
Mar 26, 2011 at 11:12 PM

I'm finding that statement a little difficult to believe; even DOM level 0 is pretty huge, and it gets progressively bigger in DOM levels 1, 2 and 3 :-) But yes, if you have implemented the DOM then wiring up document.write is easy-peasy. See http://jurassic.codeplex.com/wikipage?title=Exposing%20a%20.NET%20class%20to%20JavaScript&referringTitle=Documentation for info on how to do it if your DOM implementation is in .NET. If it's in JavaScript then it's as easy as:

document = { };
document.write = function(html) { /* your function goes here */ }

You can also check out how the built-in objects are implemented in the Jurassic source code e.g. the RegExp or JSON objects.

The real hard bit is actually implementing document.write.  Correct me if I'm wrong but I think you have to be able to handle invalid HTML that only becomes correct after multiple document.write calls - getting an HTML parser to accept that sounds tough.
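
To illustrate the point about fragments that are only valid once combined, here is a minimal sketch in plain JavaScript. It assumes the simplest possible strategy: buffer every document.write call and only hand the joined text to the HTML parser afterwards. The pendingHtml array is a made-up name for this example, and the document object is a bare stand-in, not a DOM implementation.

```javascript
// Hypothetical stand-in for a DOM document; only write() is modelled.
var pendingHtml = [];
var document = {
    write: function (html) {
        // Keep each fragment untouched; on its own it may be invalid HTML.
        pendingHtml.push(String(html));
    }
};

// Two calls that only form a complete element when taken together.
document.write("<div id='gree");
document.write("ting'>hello</div>");

// Join the fragments before passing them on to the HTML parser.
var combined = pendingHtml.join("");
```

Buffering sidesteps the partial-markup problem for a single script block, at the cost of not exposing the written nodes to the script until it has finished.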

Coordinator
Mar 26, 2011 at 11:36 PM

The basic flow should be something like the following:

  1. Create a Jurassic ScriptEngine object.
  2. Register the document and window objects on the ScriptEngine using SetGlobalValue.
  3. Read in the HTML, parsing as you go.
  4. If you encounter a <script> tag, stop parsing as HTML and just read text until you hit a </script>.
  5. Execute the JavaScript using scriptEngine.Execute() - this may insert text into the HTML parser input stream (via document.write).
  6. Start parsing HTML again.
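
The steps above can be sketched roughly as follows. This is written in plain JavaScript rather than C# so that no Jurassic API has to be guessed: runScript is a hypothetical stand-in for scriptEngine.Execute() (it uses new Function), processHtml is a made-up name, and the "HTML parser" here only collects text instead of building a tree.

```javascript
// Stand-in for scriptEngine.Execute(): runs the script with a document
// object whose write() collects text, and returns what was written.
function runScript(source) {
    var written = "";
    var document = { write: function (s) { written += String(s); } };
    new Function("document", source)(document);
    return written;
}

// Walks the input, treating everything outside <script> as HTML text.
function processHtml(html) {
    var output = "";   // text handed to the (imaginary) HTML parser
    var input = html;  // remaining input stream
    while (input.length > 0) {
        var open = /<script\b[^>]*>/i.exec(input);
        if (!open) { output += input; break; }
        output += input.slice(0, open.index);

        // Inside a script: read raw text up to the closing tag.
        var rest = input.slice(open.index + open[0].length);
        var close = /<\/script\s*>/i.exec(rest);
        if (!close) { break; }  // unterminated script: this sketch gives up

        // Text written by the script goes to the front of the input
        // stream, so it is parsed (and may contain scripts) like any HTML.
        var written = runScript(rest.slice(0, close.index));
        input = written + rest.slice(close.index + close[0].length);
    }
    return output;
}

var result = processHtml("<p>a</p><script>document.write('<b>hi</b>');</script><p>c</p>");
```

Pushing the written text back onto the input, rather than appending it to the output, is what lets a script emit further script tags, as the Google Analytics snippet at the top of this thread does.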

Note that there are probably a few deficiencies in Jurassic that will need correcting. One I can think of is that line numbers will be wrong when exceptions are thrown, since I don't currently support setting a starting line number. Another is that <script> tags sometimes have HTML comments in them (<script><!-- javascript --></script>), and Jurassic doesn't have any way of stripping those. Let me know if there are any more.
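
One possible workaround for the HTML comment issue is to strip the wrapper before the script text reaches the engine. The sketch below assumes the common layout where "<!--" sits on the first line and "-->" (optionally preceded by "//") sits on a line of its own at the end. stripHtmlCommentWrapper is a made-up helper name, not part of Jurassic.

```javascript
// Removes a leading "<!--" line and a trailing "-->" or "// -->" line.
// Anything else in the script text is left untouched.
function stripHtmlCommentWrapper(source) {
    return source
        // Leading "<!--" plus whatever else is on that first line.
        .replace(/^\s*<!--[^\r\n]*/, "")
        // Final line consisting only of an optional "//" and "-->".
        .replace(/(^|[\r\n])[ \t]*(?:\/\/)?[ \t]*-->\s*$/, "");
}

var cleaned = stripHtmlCommentWrapper("<!--\nvar x = 1;\n// -->");
```

Dropping the whole first line follows browser behaviour, where "<!--" inside a script acts like a single-line comment, so code on the same line as "<!--" would not run in a browser either.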

Jan 28, 2013 at 3:28 PM

Hi Paul,

First off, let me compliment you on the awesome work you did!

Next, I'm trying to do exactly what you describe here. You're right that creating a good parser for HTML that supports JavaScript is hard; I had some success in the past using EcmaScript.NET, but did a lot of ugly hacks to get things working. For various reasons I'm now attempting to do the same trick with Jurassic... so far without much success.

The problem is that I can't really get my head around how to bind properties / fields / methods to Jurassic, even though I've read all the documentation.

Things I did so far:

  1. Created an HTML5 parser that parses 'as you go' (suppose that's my madness :-) )
  2. Implemented W3C classes such as Window, Browser, Navigator, etc.
  3. It seems like Window acts as a global scope in browsers, so I'm registering properties from window into the global scope using, for example: engine.Global.FastSetProperty("document", document, Jurassic.Library.PropertyAttributes.NonEnumerable);
  4. I've made every W3C class, including DOM.Node, inherit from ObjectInstance. Since classes like 'body' don't seem to have a constructor, I didn't make the prototype/constructor class described in the 'Exposing [...]' page referenced above. I do expose all properties using JSProperty (see below) and call base.PopulateFunctions(); as described in the link.

 

[JSProperty(Name = "onafterprint", IsConfigurable = false, IsEnumerable = false)]
public FunctionInstance OnAfterprint
{
    get { return GetAttribute<FunctionInstance>("onafterprint"); }
    set
    {
        SetAttribute<FunctionInstance>("onafterprint", value);
        OnNotifyPropertyChanged("OnAfterprint");
    }
}

What happens is that the engine gives errors like "The property 'onafterprint' is non-configurable."

Could you please explain what I'm doing wrong here?

Coordinator
Jan 28, 2013 at 9:34 PM

The configurable flag prevents the deletion of a property and it prevents modification of the property metadata.  I'm guessing you are trying to do the latter.  If you want to make your properties non-configurable then you need to change their value using SetValue (or the indexer).
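
The difference between changing a value and changing metadata can be seen in plain ECMAScript 5, independent of any .NET binding. The sketch below uses a made-up win object and the standard Object.defineProperty. It only illustrates the language-level rules, and how Jurassic's .NET API surfaces them is not shown here.

```javascript
// A non-configurable but writable property, defined the ES5 way.
var win = {};
Object.defineProperty(win, "onafterprint", {
    value: null,
    writable: true,
    enumerable: false,
    configurable: false
});

// Changing the value is allowed, because the property is writable.
win.onafterprint = function () { return "printed"; };

// Changing the metadata (here: making it enumerable) is rejected.
var redefineError = null;
try {
    Object.defineProperty(win, "onafterprint", { enumerable: true });
} catch (e) {
    redefineError = e;
}

// Deleting is rejected as well; Reflect.deleteProperty reports false.
var deleted = Reflect.deleteProperty(win, "onafterprint");
```

If the error appears while assigning a handler, it may be worth checking whether the binding code redefines the property rather than setting its value.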

It would help if you posted the code that actually has the error!

Jan 29, 2013 at 9:34 AM

I've attempted to make a minimum testcase. There's quite a bit going on here (mostly code), so bear with me.

The testcase in plain HTML is the following:

<html>
	<body> 
		<div id='a'><div id='b'>
			</div>
			<div id='c'>
			</div>
		</div>
		<script language="javascript">
			function Bar() {
				Foo();
			}
			Window.prototype.Foo = function() {
				if (!document.all) 
				{
					var a = document.all['a'];
					var ch = a.removeChild(a.firstChild); 
					document.getElementById('c').innerHTML = 'foo';
				}
			}
			window.onload = Bar;
		</script>
	</body>
</html>

 

A couple of different things happen here:

  1. I bind a function 'Bar' to the window.onload event - where the Window is a .NET class
  2. Further, I add a function 'Foo' to the prototype of Window. Since Window is the global object in browsers, this is the same as making 'Foo' a global function
  3. Foo is used in Bar and called from the onload.
  4. document.all is a strange thing: if you ask if it exists, it returns undefined - but once you use it, it's suddenly there (sigh) and acts as document.getElementById
  5. I do some simple DOM operations and finally set the innerHTML of an element obtained through (window.)document to some text

Sounds easy enough, so let's do this in .NET shall we? :-)

I've created a bunch of classes that sort-of act like a DOM tree here. It's very simple, but you should be able to run it.

 

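    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Jurassic;
    using Jurassic.Library;

    /// <summary>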
    /// The root 'node' element. I've made this simple, but normally there's Element, HTMLElement, and so on.
    /// </summary>
    public class Node : ObjectInstance
    {
        public Node(ScriptEngine engine) : base(engine) { base.PopulateFunctions(); }
        public List<Node> Children = new List<Node>();

        public IEnumerable<Node> EnumerateRecursive()
        {
            foreach (var item in Children)
            {
                yield return item;
                foreach (var jtem in item.EnumerateRecursive())
                {
                    yield return jtem;
                }
            }
        }

        [JSProperty(Name = "id", IsConfigurable = true)]
        public string Id { get; set; }

        [JSFunction(Name = "removeChild")]
        public void RemoveChild(Node ch)
        {
            Children.Remove(ch);
        }

        [JSProperty(Name = "firstChild", IsConfigurable = true)]
        public Node FirstChild
        {
            get { return Children.FirstOrDefault(); }
            set { throw new NotImplementedException(); }
        }

        [JSProperty(Name = "innerHTML", IsConfigurable = true)]
        public string InnerHTML
        {
            get { throw new NotImplementedException(); }
            set { Children.Add(new Text(Engine, value)); }
        }
    }

    public class Text : Node
    {
        public Text(ScriptEngine engine, string val) : base(engine) { base.PopulateFunctions(); this.Value = val; }

        public string Value { get; set; }
    }

    public class Document : Node
    {
        public Document(ScriptEngine engine) : base(engine) { base.PopulateFunctions(); }

        [JSProperty(Name = "all", IsConfigurable = true, IsEnumerable = false)]
        public NodeCollection All { get { return new NodeCollection(Engine, EnumerateRecursive().ToArray()); } }

        [JSFunction(Name = "getElementById")]
        public Node GetElementById(string id)
        {
            return EnumerateRecursive().FirstOrDefault((a) => (a.Id == id));
        }
    }

    public class Body : Node
    {
        public Body(ScriptEngine engine) : base(engine) { base.PopulateFunctions(); }
    }

    public class Div : Node
    {
        public Div(ScriptEngine engine) : base(engine) { base.PopulateFunctions(); }
    }


 

Most of it is more or less the same. I'm a bit concerned that PopulateFunctions will be called multiple times down the inheritance tree, but since I didn't get too many errors there I guess that's okay... The first strange thing you will encounter is the 'all', which I made non-enumerable, but probably needs something strange; to be honest I'm not really sure what to do with that at the moment.

Either way, 'all' returns a NodeCollection that basically does a 'getElementById':

 

    public class NodeCollection : ObjectInstance
    {
        public NodeCollection(ScriptEngine engine, Node[] node)
            : base(engine)
        {
            this.node = node;
        }

        private Node[] node;

        protected override object GetMissingPropertyValue(string propertyName)
        {
            return node.FirstOrDefault((a) => (a.Id == propertyName));
        }
    }
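
For comparison, the same lookup idea can be written in a few lines of modern JavaScript. This is only an analogue to make the intended behaviour explicit: it relies on Proxy and generators, which are newer than the ECMAScript 5 that Jurassic implements, and enumerateRecursive and makeNodeCollection are made-up names for this sketch.

```javascript
// Depth-first walk over all descendants, mirroring EnumerateRecursive.
function* enumerateRecursive(node) {
    for (const child of node.children) {
        yield child;
        yield* enumerateRecursive(child);
    }
}

// Unknown property names resolve to the first node with a matching id,
// similar in spirit to overriding GetMissingPropertyValue.
function makeNodeCollection(nodes) {
    return new Proxy({}, {
        get(target, name) {
            if (name in target) return target[name];
            return nodes.find((n) => n.id === name);
        }
    });
}

// Same shape as the test document: body > div#a > (div#b, div#c).
const b = { id: "b", children: [] };
const c = { id: "c", children: [] };
const a = { id: "a", children: [b, c] };
const body = { id: undefined, children: [a] };
const doc = { children: [body] };

const all = makeNodeCollection([...enumerateRecursive(doc)]);
```

With this, all['a'] yields the outer div, and an id that does not exist yields undefined.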

 

And last but not least we need the Window object, which is sort of the global object.

    public class Window : ObjectInstance
    {
        public Window(ScriptEngine engine)
            : base(engine)
        {
            base.PopulateFunctions();
        }

        [JSProperty(Name = "document", IsConfigurable = true)]
        public Node Document { get; set; }

        [JSProperty(Name = "onload", IsConfigurable = true)]
        public FunctionInstance OnLoad { get; set; }

        internal void RegisterGlobals()
        {
            Engine.Global.DefineProperty("Window", new PropertyDescriptor(this, PropertyAttributes.Sealed), true);
            Engine.Global.DefineProperty("document", new PropertyDescriptor(null, PropertyAttributes.Sealed), true);
            Engine.Global.DefineProperty("onload", new PropertyDescriptor(null, PropertyAttributes.Sealed), true);
        }
    }

For the window object, I'm struggling with the globals. In the ideal case I would make 'Window' the new global, so probably at some point I'll just take the Jurassic.GlobalObject class and inherit from that and make a new constructor... or something like that. After all, it seems that Window is the global object in browsers, so it seems only fair to treat it like one.

Last in my minimal test case is the unit test itself. I use the Microsoft unit testing framework, but any framework will do I suppose. I try to mimic what happens in the browser here and test if the result is correct:

    // Also needs: using Microsoft.VisualStudio.TestTools.UnitTesting;
    [TestClass]
    public class JavascriptTest
    {
        [TestMethod]
        public void Test()
        {
            ScriptEngine engine = new ScriptEngine();

            // Create the DOM tree
            Window wnd = new Window(engine)
            {
                Document = new Document(engine)
                {
                    Children = new List<Node>() 
                    { 
                        new Body(engine) 
                        { 
                            Children = new List<Node>() 
                            { 
                                new Div(engine) 
                                { 
                                    Id = "a",
                                    Children = new List<Node>() {
                                        new Div(engine) { Id = "b" },
                                        new Div(engine) { Id = "c" }
                                    }}}}}
                }
            };

            // Initialize globals
            wnd.RegisterGlobals();

            // Create the evil javascript
            string javascript = @"
function Bar() {
	Foo();
}
Window.prototype.Foo = function() {
	if (!document.all) 
	{
		var a = document.all['a'];
		var ch = a.removeChild(a.firstChild); 
		document.getElementById('c').innerHTML = 'foo';
	}
}
window.onload = Bar;
";
            engine.Execute(javascript);
            wnd.OnLoad.Call(wnd);

            // Check if the dom tree is what we expect it to be
            Assert.AreEqual(1, wnd.Document.Children.Count); // body

            Assert.AreEqual(1, wnd.Document.Children[0].Children.Count); // div 'a'
            Assert.AreEqual("a", wnd.Document.Children[0].Children[0].Id);
            Assert.IsInstanceOfType(wnd.Document.Children[0].Children[0], typeof(Div));
            var div = wnd.Document.Children[0].Children[0];

            // we removed b so there's only c remaining in the div
            Assert.AreEqual(1, div.Children.Count); // div 'c'
            Assert.AreEqual("c", div.Children[0].Id);
            Assert.IsInstanceOfType(div.Children[0], typeof(Div));

            // and c has a child which is a text
            div = div.Children[0];
            Assert.AreEqual(1, div.Children.Count); // text
            Assert.IsInstanceOfType(div.Children[0], typeof(Text));
        }
    }

As you can see there's still a bunch of things going wrong, but this should give you an idea of what I'm trying to achieve.

GlobalObject