blog.willem-jan.net

Welcome to my blog

On this blog I try to regularly talk about my experiences with development. To be honest, I tried this before, more than once... But this time it will be different!

This time I'm going to make a list of stuff I want to talk about, and I'm setting a weekly reminder in my agenda to write a blogpost and/or think about more subjects.

Latest posts

Sculpin Related Content Bundle (2014-08-01)

When I wrote the PHPCR tutorial series, I wanted to show the previous posts in the same series. As this wasn't a feature Sculpin provided, I decided to write my own bundle to make this possible. That's how SculpinRelatedContentBundle started.

Although I only finished it just before I wrote the last post in the series, I'm still happy that I wrote it. But it still took me 6 months to actually tag and release it...

Installation

Add the bundle to your dependencies in your sculpin.json and run sculpin install.

{
    "require": {
        "wjzijderveld/sculpin-related-content-bundle": "~1.0"
    }
}

To use the bundle you need to do 2 things. The first is to configure each post where you want to show related content. You do this by defining the tags that relate to this content. I deliberately chose not to couple tags to each other automatically, because I didn't want every post tagged phpcr to show up next to every other post tagged phpcr.

So we will configure for each content_type which tags should be used to find related content.

title: Awesome content is awesome
tags: [phpcr, phpcr-tutorial]
related:
  posts_tags: [phpcr-tutorial]

But only configuring which content is related isn't enough. We should also show the related content somewhere!

Because I didn't want to restrict where I can show the related content, I assigned the related content to a global variable related_content. This way you can choose how and where to show the content.


{% if related_content|length %}
    <ul class="block related">
        {% for relatedSource in related_content %}
            <li><a href="{{ site.url }}{{ relatedSource.url }}">
                {{ relatedSource.title }}
            </a></li>
        {% endfor %}
    </ul>
{% endif %}
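To make the matching rule concrete, here is a minimal sketch of tag-based selection in plain PHP. It is not the bundle's actual code: the function name findRelatedPosts and the array layout of a post are assumptions made up for this illustration. It only shows the idea that a post is related when it carries one of the tags listed under related.posts_tags.

```php
<?php
// Hypothetical sketch, not the bundle's implementation.
// Each post is a plain array with 'title', 'tags' and an optional 'related' key.

/**
 * Return the posts that carry at least one of the tags listed
 * under related.posts_tags of the current post.
 */
function findRelatedPosts(array $current, array $allPosts): array
{
    $wantedTags = $current['related']['posts_tags'] ?? [];
    $related = [];
    foreach ($allPosts as $post) {
        if ($post['title'] === $current['title']) {
            continue; // never relate a post to itself
        }
        if (array_intersect($wantedTags, $post['tags'] ?? [])) {
            $related[] = $post;
        }
    }
    return $related;
}

$posts = [
    ['title' => 'Setting up', 'tags' => ['phpcr', 'phpcr-tutorial']],
    ['title' => 'Unrelated phpcr note', 'tags' => ['phpcr']],
    [
        'title' => 'Awesome content is awesome',
        'tags' => ['phpcr', 'phpcr-tutorial'],
        'related' => ['posts_tags' => ['phpcr-tutorial']],
    ],
];

$related = findRelatedPosts($posts[2], $posts);
echo implode(', ', array_column($related, 'title')), PHP_EOL;
```

The post that is only tagged phpcr is left out, which is exactly why the tags are configured per post instead of being coupled automatically.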

Yesterday I tagged version 1.0.0. About 15 minutes later I tagged 1.0.1, because I had forgotten to fix the only issue that kept me from tagging 1.0.0 sooner. You know... I just didn't think that far ahead yesterday :)

It probably doesn't work with all available content_types, as I only tried it with posts_tags, but it shouldn't be that hard to add support for other types.

Contributions are welcome!


Migrating to Jackrabbit (2014-02-17)

This is the sixth and probably last post about PHPCR with Jackalope. In the previous posts we played with PHPCR using Jackalope Doctrine DBAL as storage backend. In this post we are going to migrate our data to Jackrabbit.

Jackrabbit is an Apache project that fully implements the JSR-170 and JSR-283 specifications (JCR 1.0 and 2.0). We are going to use it through Jackalope Jackrabbit.

Preparations

We first need to install the new Jackalope package.

$ composer require jackalope/jackalope-jackrabbit "~1.1"
$ composer update

Since we installed everything in November there have been quite a few releases (Symfony 2.4, Jackalope 1.1, etc.), so we are updating those as well.

In the first post, when we set up our repository, we already prepared for the migration a bit by making our config aware of the fact that we can use multiple backends.

Disclaimer: I haven't performed this action in production yet, so I can't guarantee this works for all cases. And as always: make a backup!

Exporting the data from MySQL

To do this migration, we will export the data from MySQL and import it again into Jackrabbit. To do this I added 2 commands from PHPCR-Utils to our console, the WorkspaceExportCommand and the WorkspaceImportCommand. I also added the WorkspaceQueryCommand to perform simple SQL2 queries, but this is purely for demo purposes.

The export will be an XML file with your complete workspace. Running the export is as simple as running the command with the target filename as an argument.

$ ./console phpcr:workspace:export tutorial.xml

You should now have a file tutorial.xml with your complete workspace. I included an example of my workspace in the git repository, along with a formatted version for your convenience.

Updating our configuration

Before we import our data into Jackrabbit, we need to update our configuration. You'll also need a running Jackrabbit server to import the data into. I'm not going into the specifics of installing and running Jackrabbit; there are multiple options to do that. In this example I'll be using the standalone server.

First we are going to update the config.yml we created in the root of our project. In the jackalope section we add a section for jackrabbit, below the dbal section. In the new section we only need to configure the URL of our repository. We also change the transport to jackalope-jackrabbit.

jackalope:
  transport: jackalope-jackrabbit
  [..]
  jackrabbit:
    url: http://localhost:8080/server

I've updated the cli-config.php in the git repository, similar to the one for DoctrineDBAL, but without the database connection.

Importing our data

Before we actually import our data, let's run an SQL2 query to check that we changed our backend correctly and that the workspace is empty.

$ ./console phpcr:workspace:query "SELECT * FROM [nt:unstructured]"

This should return only a single row, the rootNode. Now, let's import our data! To do this, we first need to recreate the custom nodeTypes we created in the previous post. After that, we can actually import our data.

$ ./console tutorial:create-nodetype
$ ./console phpcr:workspace:import tutorial.xml

You should get a message like: Successfully imported file "<project path>/tutorial.xml" to path "/" in workspace "default".

When we now rerun the query from above, we should have some results! With Jackrabbit now configured, you might play with some of the features Jackrabbit has built-in, like versioning.

That's a wrap

As I mentioned in the intro, this is probably the last post in this series. I hope this series helped some people to understand a bit better what PHPCR is and how it can be used. If somebody would like me to create a more detailed example, or expand a bit more on a specific subject, let me know!


Creating custom NodeTypes (2014-01-18)

This post is the fifth in a series about PHPCR with Jackalope. In the first post we set up our repository, in the second we talked about reading and writing from and to our repository. The third post was about the QueryObjectModel and in the previous post we talked about NodeTypes and Mixins.

In this post we'll be creating our own NodeType.

Defining the NodeType

As we saw in the previous post, NodeTypes have certain properties that define how a NodeType behaves and property definitions that define what values are stored in the NodeType. To create our own NodeType, we need to tell our repository how our NodeType should behave.

Our NodeType should have a name which shouldn't clash with other NodeTypes. To be sure of that, we need to register a new namespace with our repository and use that namespace to name our NodeType.

$session->getWorkspace()->getNamespaceRegistry()
        ->registerNamespace('acme', 'http://acme.example.com/phpcr/1.0');

To define the actual NodeType we need to use the NodeTypeManager, which takes care of registering the NodeType with our repository. We need to pass a NodeTypeDefinition to NodeTypeManager::registerNodeType. We can build it from an array, from XML or create a copy from another NodeTypeDefinition instance.

In this case I went for the XML solution which has the benefit that you can easily distribute the NodeType so others can use it without the need for PHP.

<?xml version="1.0" encoding="utf-8"?>
<!-- customNodeTypes.xml -->
<nodeTypes>
    <nodeType
            name="acme:product" isMixin="false" isAbstract="false"
            isQueryable="true" hasOrderableChildNodes="true">
        <supertypes>
            <supertype>nt:base</supertype>
            <supertype>mix:title</supertype>
            <supertype>mix:referenceable</supertype>
        </supertypes>
        <propertyDefinition
                name="acme:rrpPrice"
                requiredType="decimal"
                declaringNodeType="acme:product"
                autoCreated="true"
                fullTextSearchable="false"
                mandatory="true"
                multiple="false"
                onParentVersion="COPY"
                protected="false"
                queryOrderable="true">
            <valueConstraints />
        </propertyDefinition>
        <propertyDefinition
                name="acme:media"
                requiredType="Reference"
                declaringNodeType="acme:product"
                autoCreated="false"
                fullTextSearchable="false"
                mandatory="false"
                multiple="true"
                onParentVersion="COPY"
                protected="false"
                queryOrderable="false">
            <valueConstraints>
                <valueConstraint>nt:resource</valueConstraint>
            </valueConstraints>
        </propertyDefinition>
    </nodeType>
</nodeTypes>

The definition above gives us an acme:product NodeType with 2 properties. Besides that, we also defined that it has 3 supertypes: nt:base, mix:title and mix:referenceable. For the acme:media property, we also added a valueConstraint.
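Before registering anything, it can help to check what such a file actually declares. The following is a small sketch that only uses PHP's DOM extension, so it runs without a repository. The inline XML is a shortened copy of the definition above, and the $summary array layout is just an assumption chosen for this illustration.

```php
<?php
// Sketch: summarise the node types declared in a definition file.
// Shortened copy of customNodeTypes.xml, inlined to keep the example self-contained.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<nodeTypes>
    <nodeType name="acme:product" isMixin="false">
        <supertypes>
            <supertype>nt:base</supertype>
            <supertype>mix:title</supertype>
            <supertype>mix:referenceable</supertype>
        </supertypes>
        <propertyDefinition name="acme:rrpPrice" requiredType="decimal" mandatory="true"/>
        <propertyDefinition name="acme:media" requiredType="Reference" mandatory="false"/>
    </nodeType>
</nodeTypes>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

$summary = [];
foreach ($xpath->query('//nodeType') as $nodeType) {
    // Collect the supertypes of this node type.
    $supertypes = [];
    foreach ($xpath->query('supertypes/supertype', $nodeType) as $supertype) {
        $supertypes[] = $supertype->textContent;
    }
    // Map each property name to its required type.
    $properties = [];
    foreach ($xpath->query('propertyDefinition', $nodeType) as $property) {
        $properties[$property->getAttribute('name')] = $property->getAttribute('requiredType');
    }
    $summary[$nodeType->getAttribute('name')] = [
        'supertypes' => $supertypes,
        'properties' => $properties,
    ];
}

print_r($summary);
```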

Registering the NodeType

To actually register the NodeType, we only need a few lines of code, as seen in the CreateNodeTypeCommand:

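// Jackalope classes used below (namespaces assumed from the Jackalope package)
use Jackalope\Factory;
use Jackalope\NodeType\NodeType;

// Load the XML file with our NodeType definitions
$nodeTypesDocument = new \DOMDocument();
$nodeTypesDocument->load(__DIR__ . '/../Resources/data/customNodeTypes.xml');
$xpath = new \DOMXPath($nodeTypesDocument);
$nodeTypeManager = $session->getWorkspace()->getNodeTypeManager();

foreach ($xpath->query('//nodeType') as $nodeTypeElement) {
    // Build a definition from each <nodeType> element
    $nodeType = new NodeType(new Factory(), $nodeTypeManager, $nodeTypeElement);
    // The second argument allows updating an already registered NodeType
    $nodeTypeManager->registerNodeType($nodeType, true);
}
$session->save();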

Side note: the code above won't work correctly until PR 203 has been merged.

Using the newly created NodeType

To use the new NodeType, we need to create a Node with the new NodeType as primaryType.

$rootNode->addNode('customNodeTypeNode', 'acme:product');

When we now try to save the product, we get an error stating that the property acme:rrpPrice doesn't have a default value. If we fill in the price, the node saves just fine. After that we can also add a property jcr:title, which is defined by our supertype mix:title. But when we also try to add a property foo with value bar, we get an exception telling us that our NodeTypeDefinition doesn't allow setting a property foo.
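The steps from the paragraph above could look roughly like this. It is a sketch under assumptions: the function name createProduct, the price and the title are made up, the acme:product type is assumed to be registered already, and which exception a transport throws (and whether on setProperty or on save) may differ, so the example catches the generic PHPCR RepositoryException.

```php
<?php
use PHPCR\NodeInterface;
use PHPCR\PropertyType;
use PHPCR\RepositoryException;
use PHPCR\SessionInterface;

/**
 * Hypothetical helper: create a product node and show that a property
 * outside the NodeType definition is rejected.
 */
function createProduct(SessionInterface $session, NodeInterface $rootNode): void
{
    $product = $rootNode->addNode('customNodeTypeNode', 'acme:product');

    // The mandatory price has to be set before the node can be saved.
    $product->setProperty('acme:rrpPrice', '19.95', PropertyType::DECIMAL);
    // jcr:title is allowed because mix:title is one of the supertypes.
    $product->setProperty('jcr:title', 'Example product');
    $session->save();

    try {
        // foo is not part of the definition, so this should fail.
        $product->setProperty('foo', 'bar');
        $session->save();
    } catch (RepositoryException $e) {
        echo 'Rejected: ', $e->getMessage(), PHP_EOL;
        $session->refresh(false); // discard the pending change
    }
}
```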

By creating custom NodeTypes, you can create some structure in your data, while still having the flexibility of schemaless storage.

The examples in this blogpost can be found on GitHub.


NodeTypes and Mixins (2014-01-10)

This post is the fourth in a series about PHPCR with Jackalope. In the first post we set up our repository, in the second we talked about reading and writing from and to our repository. In the previous post we played with the QOM and wrote our first queries.

In this post, we will be digging a bit deeper into PHPCR. We'll see how NodeTypes work and how you can combine some properties with the use of Mixins.

Introduction to NodeTypes

What are NodeTypes? The JCR specification defines it like this:

Node types are used to enforce structural restrictions on the nodes and properties in a workspace by defining for each node, its required and permitted child nodes and properties.

A repository can determine what NodeTypes are supported. In this case we will look at the types that Jackalope defines. For that, we use the Command provided by PHPCR-Utils. I added that to the console used by the

$ ./console phpcr:node-type:list

This will give you a list of the NodeTypes defined by Jackalope. If you had defined custom NodeTypes, those would show up here as well. As you can already see in that list, almost all of them have one or more SuperTypes. That means you can extend NodeTypes with one or more other NodeTypes.

But this list doesn't show everything we want to see, so I created a new info Command in the example that also shows PropertyDefinitions and ChildNodeDefinitions. That way we can get a better understanding what NodeTypes actually define.

$ ./console tutorial:info

In this case, we are going to look at nt:nodeType, the NodeType that is used to define custom NodeTypes.

nt:nodeType
  Supertypes:
    > nt:base
  PropertyDefinitions:
    > jcr:hasOrderableChildNodes (Boolean: Required)
    > jcr:isQueryable (Boolean: Required)
    > jcr:isMixin (Boolean: Required)
    > jcr:nodeTypeName (Name: Required)
    > jcr:isAbstract (Boolean: Required)
    > jcr:primaryItemName (Name: Optional)
    > jcr:supertypes (Name: Optional)
    > jcr:mixinTypes (Name: Optional)
    > jcr:primaryType (Name: Required)
  ChildNodeDefinitions:
    > jcr:childNodeDefinition (Optional)
    > jcr:propertyDefinition (Optional)

As you can see, a NodeType defines a lot of stuff. It defines that it should have a property that holds its name, jcr:nodeTypeName. It also defines whether the NodeType is queryable and whether its childNodes have an order. Besides that, it can also define if and which childNodes it can contain. As an example, NodeType nt:activity doesn't support any ChildNodes, nt:folder can contain any NodeType and nt:file has a required ChildNode jcr:content.

Mixin example

For an example with Mixins, I planned to use some versioning code examples to demonstrate how mixins can be used. But that was until I found out Versioning is not yet supported by Jackalope Doctrine DBAL. So I'm just going to explain what mixins can do for you, and I will use some of the other default mixins available.

A Mixin looks like a NodeType: it also defines some parameters of how a Node should behave. But unlike the primary NodeType, you can add multiple mixins to a single node. Let's first see how we add a Mixin to a Node, in this case mix:created.

$mixinExample = $rootNode->addNode('mixinExample');
$mixinExample->addMixin('mix:created');
$session->save();

That's simple, isn't it? Right away we can see what this mixin does:

var_dump($mixinExample->getProperty('jcr:created')->getString());
// string(29) "2014-01-10T22:15:25.000+01:00"

So we now have a creation date without us adding it manually. We also got a jcr:createdBy property, which is empty by default. Let's add another mixin.

$mixinExample->addMixin('mix:lastModified');

That mixin provides us with a jcr:lastModified and a jcr:lastModifiedBy property. Whether jcr:lastModified is updated automatically differs per transport layer. Jackalope Doctrine DBAL doesn't do that at the moment, but we can still set it manually.

That's it for this post. It took a lot longer than planned to finish this post, partly because of the holidays and partly because I stumbled on some bugs in Jackalope that I wanted to fix first.
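Setting the modification date by hand could be wrapped in a small helper like the one below. This is only a sketch: the function name touchLastModified and the $userId parameter are assumptions for this example, and a transport that treats these properties as protected may refuse the change.

```php
<?php
use PHPCR\NodeInterface;
use PHPCR\SessionInterface;

/**
 * Hypothetical helper: update the mix:lastModified properties manually
 * for transports that don't maintain them automatically.
 */
function touchLastModified(SessionInterface $session, NodeInterface $node, string $userId): void
{
    // A DateTime value is stored as a DATE property.
    $node->setProperty('jcr:lastModified', new \DateTime());
    $node->setProperty('jcr:lastModifiedBy', $userId);
    $session->save();
}
```

It would be called right after changing a node, for example touchLastModified($session, $mixinExample, 'admin').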

I hope the next part of this blogpost will appear within the next 2 weeks.


PHPCR: Query Object Model (2013-12-09)

This post is the third in a series about PHPCR with Jackalope. In the first post we set up our repository, in the second we talked about reading and writing from and to our repository.

In this post, we will continue with the reading part, but this time by using queries. For this, PHPCR has multiple possibilities. One of them is the Query Object Model (QOM) with a Factory. At first this can seem a bit verbose, but I hope to make clear why that is. Besides the QOM, there is also the possibility to use JCR-SQL2 queries, a query language very similar to SQL, which (in the case of Jackalope DoctrineDBAL) gets converted to the QOM. Jackalope Jackrabbit uses SQL2 to communicate with Jackrabbit. There are 2 other query methods (SQL1 and XPath), but I'm not showing those in this series.

For the examples I use in this post, I created a fixture file to load. This is loaded in the new Command I added.

But before we can start with querying, we need to understand what we are querying and how we limit the results. With PHPCR you're not querying tables or collections in a way you might be used to with SQL or NoSQL databases. When you create a query you specify a selector (or multiple if the repository supports joins). So, when you want to query for all files, without knowing where they are in your tree, you simply use the following query.

$factory = $session->getWorkspace()->getQueryManager()->getQOMFactory();
$source = $factory->selector('file', 'nt:file');
$qom = $factory->createQuery($source);
$result = $qom->execute();

Or with JCR-SQL2

$query = $session->getWorkspace()->getQueryManager()
                 ->createQuery("SELECT * FROM [nt:file]",
                               QueryInterface::JCR_SQL2);
$result = $query->execute();

Already you can see the difference in verbosity between both methods. Because of this, you will probably use JCR-SQL2 when building your queries. But in this post I will use the QOM Factory for most examples because (IMHO) it shows a bit better what parts make up a query.

Update: Instead of working with the QOMFactory directly, you can also work with the more fluent QueryBuilder that's provided by PHPCR Utils.

$qomFactory = $session->getWorkspace()->getQueryManager()->getQOMFactory();
$queryBuilder = new \PHPCR\Util\QOM\QueryBuilder($qomFactory);
$queryBuilder
    ->from($qomFactory->selector('file', 'nt:file'))
    ->where($qomFactory->descendantNode('file', '/documents'))
    ->execute();

Basic conditions

In the fixture file, I created some news items which we are going to query. The items have the nodeType nt:unstructured, so that's what we are querying for. For the first example we are just retrieving all items that are placed under /queryExamples/news.

We first need to define what selector we are going to use.

/** @var QueryObjectModelFactoryInterface $qomFactory */
$qomFactory = $session->getWorkspace()->getQueryManager()->getQOMFactory();
$source = $qomFactory->selector('news', 'nt:unstructured');

The first parameter is the name of the selector, which you get to choose yourself; you might call it an alias. The second is the primary nodeType we are querying for, so in this case nt:unstructured.

Now we can define what columns we want to return in the results. For this example we only select the title.

$titleColumn = $qomFactory->column('news', 'title', 'title');

The first parameter is the selector from which we want to select something. The second parameter is the property we want to select and the third is the name we want to use for the column, like an alias.

Next, we create the query with a condition to limit the nodes by path. This is called the DescendantNodeConstraint.

$qom = $qomFactory->createQuery(
    $source,
    $qomFactory->descendantNode('news', '/queryExamples/news'),
    array(),
    array($titleColumn)
);

The first parameter defines the selector to query from, the second is the condition for the query. The next parameter defines the order of the results. The last parameter contains the columns to select. The first parameter of the constraint tells which selector we want to use, the second is the path under which we want to select nodes.

Now we are ready to execute the query and loop over the results.

$result = $qom->execute();
foreach ($result->getRows() as $newsItem) {
    echo $newsItem->getValue('title');
}

Now that we have our first results, we are going to limit our news items by author. This is a simple PropertyValue constraint. But because we now have 2 constraints, we need to change our constraint into an AndConstraint, and it becomes very clear that the QOM Factory can get very verbose.

$qom = $qomFactory->createQuery(
    // SelectorInterface
    $qomFactory->selector('news', 'nt:unstructured'),
    // AndInterface
    $qomFactory->andConstraint(
        // DescendantNodeInterface
        $qomFactory->descendantNode('news', '/queryExamples/news'),
        // ComparisonInterface
        $qomFactory->comparison(
            // PropertyValueInterface
            $qomFactory->propertyValue('news', 'jcr:author'),
            // 'jcr.operator.equal.to'
            QueryObjectModelConstantsInterface::JCR_OPERATOR_EQUAL_TO,
            // LiteralInterface
            $qomFactory->literal('foo')
        )
    ) // No orderings or columns for this example
);
echo count($qom->execute()->getRows()); // 2

This seems like a lot of code for a simple query, so I'll try to explain why that is. You can already see that there are different kinds of constraints. You don't always compare 2 properties but, as already used, you might need to test if the nodes are descendants of a specific node, or if the node is a direct child of a given node. Or maybe you only need to query nodes that have a specific property. Because everything is an object, it's much easier for a transport to walk the query in order to execute it.
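To illustrate what "walking the query" means, here is a minimal sketch with a tiny object model of its own. The classes Constraint, DescendantNode, Comparison and AndConstraint and the function toSql2 are hypothetical stand-ins written for this example; they are not the PHPCR interfaces and not how Jackalope is implemented.

```php
<?php
// Hypothetical mini object model, only to show how a tree of constraint
// objects can be walked recursively.
interface Constraint {}

final class DescendantNode implements Constraint
{
    public $selector;
    public $path;

    public function __construct(string $selector, string $path)
    {
        $this->selector = $selector;
        $this->path = $path;
    }
}

final class Comparison implements Constraint
{
    public $selector;
    public $property;
    public $operator;
    public $literal;

    public function __construct(string $selector, string $property, string $operator, string $literal)
    {
        $this->selector = $selector;
        $this->property = $property;
        $this->operator = $operator;
        $this->literal = $literal;
    }
}

final class AndConstraint implements Constraint
{
    public $left;
    public $right;

    public function __construct(Constraint $left, Constraint $right)
    {
        $this->left = $left;
        $this->right = $right;
    }
}

/** Walk the tree and render every node as an SQL2-like fragment. */
function toSql2(Constraint $constraint): string
{
    if ($constraint instanceof AndConstraint) {
        return '(' . toSql2($constraint->left) . ' AND ' . toSql2($constraint->right) . ')';
    }
    if ($constraint instanceof DescendantNode) {
        return sprintf('ISDESCENDANTNODE(%s, [%s])', $constraint->selector, $constraint->path);
    }
    if ($constraint instanceof Comparison) {
        // Escape single quotes in the literal by doubling them.
        $literal = str_replace("'", "''", $constraint->literal);
        return sprintf("%s.[%s] %s '%s'", $constraint->selector, $constraint->property, $constraint->operator, $literal);
    }
    throw new InvalidArgumentException('Unknown constraint: ' . get_class($constraint));
}

$tree = new AndConstraint(
    new DescendantNode('news', '/queryExamples/news'),
    new Comparison('news', 'jcr:author', '=', 'foo')
);
echo toSql2($tree), PHP_EOL;
```

Each constraint type gets its own branch, so a backend can translate the same tree to SQL, XPath or anything else without parsing a query string.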

More operands

The next query has a few more constraints, including one that you might have expected to work differently. We are going to query for nodes under /queryExamples/news, where the author is NOT bar AND where the node does NOT have a categories property OR where the node contains a category foo. Then we order the result based on NodeName in descending order.

$qom = $qomFactory->createQuery(
    $source,
    // First and constraint
    $qomFactory->andConstraint(
        // Descendant constraint
        $qomFactory->descendantNode('news', '/queryExamples/news'),
        // Second and constraint
        $qomFactory->andConstraint(
            // Compare author
            $qomFactory->comparison(
                $qomFactory->propertyValue('news', 'jcr:author'),
                QueryObjectModelConstantsInterface::JCR_OPERATOR_NOT_EQUAL_TO,
                $qomFactory->literal('bar')
            ),
            // Or constraint
            $qomFactory->orConstraint(
                // Check for missing property categories
                $qomFactory->notConstraint(
                    $qomFactory->propertyExistence('news', 'categories')
                ),
                // Or check for category foo
                $qomFactory->comparison(
                    $qomFactory->propertyValue('news', 'categories'),
                    QueryObjectModelConstantsInterface::JCR_OPERATOR_EQUAL_TO,
                    $qomFactory->literal('foo')
                )
            )
        )
    ),
    // Order nodes based on name
    array($qomFactory->descending($qomFactory->nodeName('news')))
);

We now receive 2 rows, item2 and item1 in that order.

To keep in mind

Especially with Doctrine DBAL there are a few limitations with querying. For example, comparing dates is a bit tricky because the properties are stored in XML. A workaround would be to store the date in a numeric format, so it becomes a lot easier to compare and order it.
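The numeric workaround could be as simple as storing a Unix timestamp. The sketch below is plain PHP and shows only the conversion and comparison. The dates are made up, and the idea of keeping the result in a separate numeric property is an assumption, not something the example repository does.

```php
<?php
// Sketch: turn dates into integers that compare and sort naturally.
$published = new DateTimeImmutable('2013-12-09 10:00:00', new DateTimeZone('Europe/Amsterdam'));
$cutoff    = new DateTimeImmutable('2013-12-01 00:00:00', new DateTimeZone('UTC'));

// Seconds since the Unix epoch, independent of the timezone of the object.
$publishedAt = $published->getTimestamp();
$cutoffAt    = $cutoff->getTimestamp();

// In a query, the integer would be used as the literal of a comparison
// on the numeric property. Here plain PHP shows the same comparison.
var_dump($publishedAt > $cutoffAt); // bool(true)
```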

JCR-SQL2

Now let's have a quick look at the JCR-SQL2 variant of the query above. We can just generate that from the code above with $qom->getStatement();

SELECT * FROM [nt:unstructured] AS news
WHERE (
    ISDESCENDANTNODE(news, [/queryExamples/news])
    AND (
        news.[jcr:author] <> 'bar'
        AND (
            NOT news.categories IS NOT NULL
            OR news.categories = 'foo'
        )
    )
)
ORDER BY NAME(news) DESC

As you can see, that is a lot shorter and probably more readable than the QOM Factory version. But a JCR-SQL2 query can get hard to read when you build it dynamically, so it definitely isn't always the better solution.

Other constraints

We already saw quite a few constraints and I named a few others. Here is a quick overview of all the constraints that PHPCR defines.

  • And: Combine 2 required constraints
  • ChildNode: Check parent node
  • Comparison: Compare 2 values, can be any operand
  • DescendantNode: Check if path is under specified path
  • FullTextSearch: Full text search
  • Not: Inverse given constraint
  • Or: Combine 2 optional constraints
  • PropertyExistence: Check if the node has a property
  • SameNode: Compare node with another node

NOTE: Not all constraints are available in the Jackalope DoctrineDBAL transport. For example, the SameNodeConstraint is not yet implemented.

We might see some of these constraints being used in future posts. But I guess most are pretty obvious in what they do.

Next post

In the next post, I'll tell more about nodeTypes and mixins. I'll do that by using versionable as an example.